[net.sources.bugs] What happens during an unlink

edward@ukecc.UUCP (Edward C. Bennett) (04/27/86)

In article <979@kitty.UUCP>, larry@kitty.UUCP writes:
>In article <403@ukecc.UUCP>, edward@ukecc.UUCP (Edward C. Bennett) writes:
>>
>>       But you can recover an unlinked file! I know, I've had to do it.
>> You must unmount the file system and search the free list for your data.
>> It's a PITA, but worth it if you lose something big.
>
>        I don't claim to be a UNIX internals expert (I have enough trouble
> writing I/O drivers :-) ), but don't most ports of UNIX zero disk blocks after
> an unlink(2)?  As I seem to recall, unlink(2) is derived from unlink.s, which
> is assembly language specific for the given machine.  And unlink.s contains
> a routine _unlink which fills the disk blocks with .word defined as 0x0000.

        When you unlink a file, the disk block addresses in the inode are
zeroed, not the actual data blocks. Zeroing the inode is tantamount to
'forgetting' where the data is actually stored. The 'forgotten' data
blocks are added to the (top of the) free list to be used when adding data
to the filesystem.
        When the old data blocks are placed in the free list, their contents
are unimportant. Zeroing the blocks would be a waste of time since whatever
is there will be overwritten when the blocks are allocated to another file.
This situation is identical to the malloc()/free() process. i.e. Space
that has been malloc()ed will have data written into it. When this space
is free()ed, the data in it is not altered and is useable until the next
malloc().
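
As a rough illustration of the bookkeeping described above, here is a toy
in-memory model in C. It is not taken from any UNIX kernel: the names
toy_inode, toy_unlink and toy_alloc, the sizes, and the convention that
block 0 means "no block" are all assumptions made for the sketch.

```c
#include <assert.h>
#include <string.h>

#define NBLOCKS  8      /* blocks on the toy "disk" */
#define BLKSIZE  16     /* bytes per block */
#define NADDR    4      /* block addresses held in one inode */

static char disk[NBLOCKS][BLKSIZE];   /* data blocks */
static int  freelist[NBLOCKS];        /* stack of free block numbers */
static int  nfree;                    /* number of entries on the stack */

struct toy_inode {
    int nlink;            /* directory entries naming this file */
    int addr[NADDR];      /* block numbers; 0 means "no block" */
};

/* Drop one link. On the last link, release the blocks but leave their
 * contents alone: only the inode's record of them is cleared. */
void toy_unlink(struct toy_inode *ip)
{
    int i;

    assert(ip->nlink > 0);
    if (--ip->nlink > 0)
        return;                           /* other names still refer to it */
    for (i = 0; i < NADDR; i++) {
        if (ip->addr[i] != 0) {
            assert(nfree < NBLOCKS);
            freelist[nfree++] = ip->addr[i];  /* push on top of free list */
            ip->addr[i] = 0;                  /* inode "forgets" the block */
        }
    }
}

/* Hand out the block on top of the free list, or 0 if none is left. */
int toy_alloc(void)
{
    return nfree > 0 ? freelist[--nfree] : 0;
}
```

In this model the bytes in disk[] survive toy_unlink() untouched and are only
lost once toy_alloc() hands the block to a new owner who writes on it.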

--
Edward C. Bennett

UUCP: ihnp4!cbosgd!ukma!ukecc!edward

Kentucky: The state that is being dragged, kicking and screaming,
          into the 20th century.

"Goodnight M.A."

wcs@ho95e.UUCP (#Bill_Stewart) (04/30/86)

In article <422@ukecc.UUCP> edward@ukecc.UUCP (Edward C. Bennett) writes:
>
>>.....[stuff about what happens to blocks when you unlink the file]...
>        When the old data blocks are placed in the free list, their contents
>are unimportant. Zeroing the blocks would be a waste of time since whatever
>is there will be overwritten when the blocks are allocated to another file.
>This situation is identical to the malloc()/free() process. i.e. Space
>that has been malloc()ed will have data written into it. When this space
>is free()ed, the data in it is not altered and is useable until the next
>malloc().

Please don't depend on this behaviour!  Under the older mallocs (including
System V and 4.1BSD), you could depend on it, but systems with newer and
fancier mallocs don't promise it.  In particular, System V Release 2 offers
two malloc routines.  The default is the old malloc, but the new -lmalloc
library provides routines that offer better performance under some kinds of
input, and the man page says you can't trust freed-up memory after a free()
or realloc().

In general, the safety of using freed space depends on the implementation of
the data structures chaining space together.
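
One way to stay clear of the problem, sketched here on the assumption that
the caller still needs the contents afterwards, is to copy the data out
before the free() rather than read it back later. The helper name
release_keeping_copy is hypothetical and comes from no library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: release a heap string, but hand back a private
 * copy of its contents taken BEFORE the free(). Returns NULL if the
 * copy cannot be allocated (the original is freed either way). */
char *release_keeping_copy(char *s)
{
    size_t len;
    char *copy;

    assert(s != NULL);
    len = strlen(s) + 1;          /* include the terminating NUL */
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    free(s);                      /* from here on, s must not be touched */
    return copy;
}
```

The caller owns the returned copy and must free() it in turn.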
-- 
# Bill Stewart, AT&T Bell Labs 2G-202, Holmdel NJ 1-201-949-0705 ihnp4!ho95c!wcs

eric@chronon.UUCP (Eric Black) (05/01/86)

In article <422@ukecc.UUCP> edward@ukecc.UUCP (Edward C. Bennett) writes:
>
>In article <979@kitty.UUCP>, larry@kitty.UUCP writes:
>>
>>        I don't claim to be a UNIX internals expert (I have enough trouble
>> writing I/O drivers :-) ), but don't most ports of UNIX zero disk blocks after
>> an unlink(2)?  As I seem to recall, unlink(2) is derived from unlink.s, which
>> is assembly language specific for the given machine.  And unlink.s contains
>> a routine _unlink which fills the disk blocks with .word defined as 0x0000.
>
>        When you unlink a file, the disk block addresses in the inode are
>zeroed, not the actual data blocks. Zeroing the inode is tantamount to
>'forgetting' where the data is actually stored. The 'forgotten' data
>blocks are added to the (top of the) free list to be used when adding data
>to the filesystem.
>        When the old data blocks are placed in the free list, their contents
>are unimportant. Zeroing the blocks would be a waste of time since whatever
>is there will be overwritten when the blocks are allocated to another file.
>This situation is identical to the malloc()/free() process. i.e. Space
>that has been malloc()ed will have data written into it. When this space
>is free()ed, the data in it is not altered and is useable until the next
>malloc().

Some unitory systems do, indeed, zero out disk blocks when de-allocated,
and similarly clear memory when freed.  Any system you sell to customers
with concerns about security will require this.  Check out DOD requirements
for secure systems in the "Department of Defense Trusted Computer
System Evaluation Criteria", publication CSC-STD-001-83 (my copy is
dated March 1985) for this and other interesting features...

Spooks aren't the only people who might desire disks & memory to be
cleansed when released, by the way.

* "unitory" is a trademark of absolutely nobody, and I like it!

-- 
Eric Black   "Garbage In, Gospel Out"
UUCP:        {sun,pyramid,hplabs,amdcad}!chronon!eric
WELL:        eblack
BIX:         eblack

barmar@mit-eddie.MIT.EDU (Barry Margolin) (05/02/86)

In article <238@chronon.chronon.UUCP> eric@chronon.UUCP (Eric Black) writes:
>Some unitory systems do, indeed, zero out disk blocks when de-allocated,
>and similarly clear memory when freed.  Any system you sell to customers
>with concerns about security will require this.  Check out DOD requirements
>for secure systems in the "Department of Defense Trusted Computer
>System Evaluation Criteria", publication CSC-STD-001-83 (my copy is
>dated March 1985) for this and other interesting features...

I don't have my copy of the Criteria handy, but I don't believe that it
requires zeroing of freed disk blocks (I'm pretty sure that we don't
zero freed disk blocks on Multics, and we are rated B2).  What it
requires is that the old data not be accessible upon reuse.  A freed
disk block will never be paged into memory, and when it is reused it
will be completely overwritten by the memory frame being paged out.  And
an unused physical memory frame will be zeroed before being allocated
into the page table (but not if the frame is being allocated to hold a
disk page being read in).

Working from memory, I think the only requirement about zeroing has to
do with removable media.  The system must be able to completely destroy
the data upon request.  For example, we have a tape drive operation
(called "data security erase", I think) that overwrites every record of
the tape several times, to make sure that no residual data can be
detected.
-- 
    Barry Margolin
    ARPA: barmar@MIT-Multics
    UUCP: ..!genrad!mit-eddie!barmar

edward@ukecc.UUCP (Edward C. Bennett) (05/05/86)

In article <238@chronon.chronon.UUCP>, eric@chronon.UUCP (Eric Black) writes:
> >
> >	[discussion of what unlink(2) does]
> 
> Some unitory systems do, indeed, zero out disk blocks when de-allocated,
> and similarly clear memory when freed.  Any system you sell to customers
> with concerns about security will require this.  Check out DOD requirements
> for secure systems in the "Department of Defense Trusted Computer
> System Evaluation Criteria", publication CSC-STD-001-83 (my copy is
> dated March 1985) for this and other interesting features...
> 
> Spooks aren't the only people who might desire disks & memory to be
> cleansed when released, by the way.
> 

	You're absolutely right. I never thought about it that way.

-- 
Edward C. Bennett

UUCP: ihnp4!cbosgd!ukma!ukecc!edward

Kentucky: The state that is being dragged, kicking and screaming,
	  into the 20th century.

"Goodnight M.A."

jgy@hropus.UUCP (jgy) (05/07/86)

Someone said:
> When you unlink a file, the disk block addresses in the inode are
> zeroed, not the actual data blocks. Zeroing the inode is tantamount to
> 'forgetting' where the data is actually stored. ..............

This is not true.  All that is necessary is that the blocks be put on the
freelist, and the inode marked unallocated and added to the inode freelist.
If that were all the system did, you could just go and look at your
unallocated inode for the block information. The onus would then be on the
system to clear the inode before reusing it. The only possible dispute I can
see with this (problems with a crash can be handled) is that of who should be
"charged" with clearing someone else's dirty inode!

levy@ttrdc.UUCP (Daniel R. Levy) (05/08/86)

In article <438@ukecc.UUCP>, edward@ukecc.UUCP (Edward C. Bennett) writes:
>In article <238@chronon.chronon.UUCP>, eric@chronon.UUCP (Eric Black) writes:
>> >	[discussion of what unlink(2) does]
>> Some unitory systems do, indeed, zero out disk blocks when de-allocated,
>> and similarly clear memory when freed.  Any system you sell to customers
>> with concerns about security will require this.  Check out DOD requirements
>> for secure systems in the "Department of Defense Trusted Computer
>> System Evaluation Criteria", publication CSC-STD-001-83 (my copy is
>> dated March 1985) for this and other interesting features...
>> Spooks aren't the only people who might desire disks & memory to be
>> cleansed when released, by the way.
>	You're absolutely right. I never thought about it that way.
>Edward C. Bennett

Hmmmm.  Maybe there should be an option to 'rm' to cause it to zero out
files before unlinking them?  (like rm -e [for erase], similar to VMS's
DELETE/ERASE)

I don't see however, why it would matter whether memory is zeroed upon
release, as long as it gets zeroed before reallocation by an ordinary user
(and accesses fail, e.g., with a "bus error," if one is trying to read or
write outside of one's allocated range).  After all, if you're the administrator
and can look at the memory contents you can spy on running processes anyway.
-- 
 -------------------------------    Disclaimer:  The views contained herein are
|       dan levy | yvel nad      |  my own and are not at all those of my em-
|         an engihacker @        |  ployer or the administrator of any computer
| at&t computer systems division |  upon which I may hack.
|        skokie, illinois        |
 --------------------------------   Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
						vax135}!ttrdc!levy

dave@onfcanim.UUCP (Dave Martindale) (05/08/86)

In article <442@hropus.UUCP> jgy@hropus.UUCP (jgy) writes:
>Someone said:
>> When you unlink a file, the disk block addresses in the inode are
>> zeroed, not the actual data blocks. Zeroing the inode is tantamount to
>> 'forgetting' where the data is actually stored. ..............
>
>This is not true, all that is necessary is that the blocks be put on the
>freelist, the inode marked unallocated and added to the inode freelist.
>If this were the case you could just go and look at your unallocated inode
>for the block information. The onus would be on the system to clear the
>inode before being reused. The only possible dispute I can see with this
>is (problems with a crash can be handled) that of who should be
>"charged" with clearing of someone else's dirty inode!

There is still a way of getting back most of the data of a just-deleted
file on an inactive filesystem: the data blocks have just been put onto
the freelist, and 99/100 of them still have their original data in them.
Just poke through the first blocks of the free list to get your data.

If someone has already allocated the blocks to a new file (by writing on
them), tough luck.  This generally will happen within a few seconds, so
there seems little point in writing out inodes with the block pointers
still intact - it will seldom do you much good.

But you have to modify the inode on disk anyway, to make sure it is marked
free in case a crash happens, and if you're going to modify even one bit
of it you might as well zero the whole thing - the cost is primarily
in the disk I/O and copying.

pdg@ihdev.UUCP (P. D. Guthrie) (05/08/86)

In article <861@ttrdc.UUCP> levy@ttrdc.UUCP (Daniel R. Levy) writes:
>In article <438@ukecc.UUCP>, edward@ukecc.UUCP (Edward C. Bennett) writes:
>>In article <238@chronon.chronon.UUCP>, eric@chronon.UUCP (Eric Black) writes:
>>> >	[discussion of what unlink(2) does]
>>> Some unitory systems do, indeed, zero out disk blocks when de-allocated,
>>> and similarly clear memory when freed.  Any system you sell to customers
>>> with concerns about security will require this.  Check out DOD requirements
>>> for secure systems in the "Department of Defense Trusted Computer
>>> System Evaluation Criteria", publication CSC-STD-001-83 (my copy is
>>> dated March 1985) for this and other interesting features...
>>> Spooks aren't the only people who might desire disks & memory to be
>>> cleansed when released, by the way.
>>	You're absolutely right. I never thought about it that way.
>>Edward C. Bennett
>
>Hmmmm.  Maybe there should be an option to 'rm' to cause it to zero out
>files before unlinking them?  (like rm -e [for erase], similar to VMS's
>DELETE/ERASE)
>
The trouble with this is that it really would have to be an option to
unlink(2), which would make a lot of current software obsolete.  The
only other way would be to have rm directly write to disk,  but there is
too much margin for error or mass destruction here.
>I don't see however, why it would matter whether memory is zeroed upon
>release, as long as it gets zeroed before reallocation by an ordinary user
>(and accesses fail, e.g., with a "bus error," if one is trying to read or
>write outside of one's allocated range).  After all, if you're the administrator
>and can look at the memory contents you can spy on running processes anyway.
>-- 
Pretty much true on a UNIX system, although zeroing memory does make it
harder to spy, but those DOD requirements are generic for all trusted
computer systems, and there are others where it would make more sense.

> -------------------------------    Disclaimer:  The views contained herein are
>|       dan levy | yvel nad      |  my own and are not at all those of my em-
>|         an engihacker @        |  ployer or the administrator of any computer
>| at&t computer systems division |  upon which I may hack.
>|        skokie, illinois        |
> --------------------------------   Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
>						vax135}!ttrdc!levy


-- 

Paul Guthrie		`See the happy moron, he doesn't give a damn.
ihnp4!ihdev!pdg		 I wish I were a moron. My God! Perhaps I am.'

levy@ttrdc.UUCP (Daniel R. Levy) (05/10/86)

In article <634@ihdev.UUCP>, pdg@ihdev.UUCP (P. D. Guthrie) writes:
>In article <861@ttrdc.UUCP> levy@ttrdc.UUCP (Daniel R. Levy) writes:
>>In article <438@ukecc.UUCP>, edward@ukecc.UUCP (Edward C. Bennett) writes:
>>>In article <238@chronon.chronon.UUCP>, eric@chronon.UUCP (Eric Black) writes:
>>>> >	[discussion of what unlink(2) does]
>>>> Spooks aren't the only people who might desire disks & memory to be
>>>> cleansed when released, by the way.
>>>	You're absolutely right. I never thought about it that way.
>>>Edward C. Bennett
>>Hmmmm.  Maybe there should be an option to 'rm' to cause it to zero out
>>files before unlinking them?  (like rm -e [for erase], similar to VMS's
>>DELETE/ERASE)
>>
>The trouble with this is that it really would have to be an option to
>unlink(2), which would make a lot of current software obsolete.  The
>only other way would be to have rm directly write to disk,  but there is
>too much margin for error or mass destruction here.

Why either problem? The hypothetical 'rm -e' would check to see if the
file was an ordinary file, and if so try to open() it.  If it succeeded,
it would write nulls all the way to the end (rm does a stat() anyway
so that the file length is easy to get) then blow it away with an
unlink() (or maybe vice versa, so that if it were interrupted in mid-zeroing,
which would purposely be made difficult to do by ignoring interrupts, quits,
and hangups during the zeroing process, at least there wouldn't be a partially
munged file left).  No mass destruction possible (at least not worse than the
present rm), since ordinary user calls are being used, no setuid or direct
device access stuff.  If the file could be unlinked but not opened for
writing, an attempt to 'rm -e' would warn the user about this [as it does
now if the user cannot write the file] and ask for confirmation before
unlinking sans zeroing out.
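
A minimal sketch of the zero-then-unlink core of such an 'rm -e', using only
ordinary user calls as described above. The function name erase_and_unlink,
the 4096-byte buffer and the fsync() call are assumptions of this sketch;
it does not ignore signals or ask for confirmation, and it only helps on
file systems that rewrite blocks in place.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical core of an "rm -e": overwrite a regular file with zeros,
 * then unlink it. Returns 0 on success, -1 if the file was left in
 * place (untouched or only partly zeroed). */
int erase_and_unlink(const char *path)
{
    static const char zeros[4096];      /* statics start out zero-filled */
    struct stat st;
    off_t left;
    int fd;

    assert(path != NULL);
    if (lstat(path, &st) < 0 || !S_ISREG(st.st_mode))
        return -1;                      /* only ordinary files are zeroed */
    if ((fd = open(path, O_WRONLY)) < 0)
        return -1;                      /* caller may warn and ask to confirm */

    for (left = st.st_size; left > 0; ) {
        size_t want = left < (off_t)sizeof zeros ? (size_t)left : sizeof zeros;
        ssize_t n = write(fd, zeros, want);

        if (n <= 0) {                   /* give up on the first failed write */
            close(fd);
            return -1;
        }
        left -= n;
    }
    if (fsync(fd) < 0) {                /* push the zeros out to the disk */
        close(fd);
        return -1;
    }
    if (close(fd) < 0)
        return -1;
    return unlink(path);
}
```

A real 'rm -e' would call this once per argument and fall back to the usual
confirmation prompt whenever it returns -1.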

Main disadvantage would be slowness.  Things could be speeded up (for the
user) by forking a nohup'ed background process to finish off the zeroing
out for each file (but a 'rm -e *' might run up against a process limit
problem, so maybe for big jobs this backgrounding would not happen or
be done in small groups of processes).

Directory files would be a different problem.  They could be "zeroed"
from user code by filling them with empty files like "a", "b", etc.
after deleting the contents, then unlinking those empty files, then doing
the rmdir.  Device files would not need zeroing, obviously (except maybe
for fifos--does the last data to go down them before close stay on disk?).
-- 
 -------------------------------    Disclaimer:  The views contained herein are
|       dan levy | yvel nad      |  my own and are not at all those of my em-
|         an engihacker @        |  ployer or the administrator of any computer
| at&t computer systems division |  upon which I may hack.
|        skokie, illinois        |
 --------------------------------   Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
						vax135}!ttrdc!levy

Makey@LOGICON.ARPA (Jeff Makey) (05/10/86)

Seen in article <438@ukecc.UUCP> by "Edward C. Bennett" <edward@ukecc.uucp>:

>In article <238@chronon.chronon.UUCP>, eric@chronon.UUCP (Eric Black) writes:
>> >
>> >     [discussion of what unlink(2) does]
>>
>> Some unitory systems do, indeed, zero out disk blocks when de-allocated,
>> and similarly clear memory when freed.  Any system you sell to customers
>> with concerns about security will require this.  Check out DOD requirements
>> for secure systems in the "Department of Defense Trusted Computer
>> System Evaluation Criteria", publication CSC-STD-001-83 (my copy is
>> dated March 1985) for this and other interesting features...

To prevent any misconceptions, it should be noted that CSC-STD-001-83
does not specifically require disk space or memory to be cleared when
freed, or when allocated, or that it be written to before you read
from it.  However, unless the system in question enforces at least
*one* of these strategies it will most likely fail CSC-STD-001-83's
"Object Reuse" requirement.

                         :: Jeff Makey
                            Makey@LOGICON.ARPA

P.S.  Copies of CSC-STD-001-83 dated March 1985 can be considered
      collector's items.  That cover date is a misprint and only a few
      hundred of them were distributed.  The only correct cover date is
      15 August 1983.

greg@utcsri.UUCP (Gregory Smith) (05/12/86)

In article <634@ihdev.UUCP> pdg@ihdev.UUCP (55224-P. D. Guthrie) writes:
>In article <861@ttrdc.UUCP> levy@ttrdc.UUCP (Daniel R. Levy) writes:
>>Hmmmm.  Maybe there should be an option to 'rm' to cause it to zero out
>>files before unlinking them?  (like rm -e [for erase], similar to VMS's
>>DELETE/ERASE)
>>
>The trouble with this is that it really would have to be an option to
>unlink(2), which would make a lot of current software obsolete.  The
>only other way would be to have rm directly write to disk,  but there is
>too much margin for error or mass destruction here.

How about having unlink(2) look in the environment, and if it sees 'UNLINK=
zerofiles', it zeroes the storage. Relatively few calls to unlink arise from
the user typing 'rm something', so the '-e' option would be somewhat
insufficient. This way, you could get all of them without the software
obsoletion problem.

-- 
"Canabee be said2b or not2b anin tire b, if half thabee isnotabee, due2
somain chunt injury?" - Eric's Dilemma
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg

jdptxt@adiron.UUCP (The Mad Hacker) (05/19/86)

Several problems occur if one wants to zero out the file blocks upon
unlinking them.  What if the file that I try to remove has more than one
link to it?  Should unlink be successful but the zero-ing not be done or
should the zero-ing be done regardless or should unlink fail altogether?
Also, it is possible to rm a file that you have no access to and even one
that you don't own.  An attempt to write zeros to such a file should fail
but it is not a problem to rm it even if it is the last link to the file.
(We have a project here which heavily depends on the idea that an rm on a
file does not change the contents of the disk blocks that contained those
files because someone else may actually own the file or have a use for the
link to the file.)

If you want an rm to zero the file blocks and then unlink the file, then
go ahead and write one but please don't ask that everyone install it.


					jdp

jerry@oliveb.UUCP (Jerry Aguirre) (05/20/86)

Regarding the proposal for an option to "rm" or unlink(2) to zero
discarded disk blocks for security reasons.

I hope that "rm" or unlink(2) is going to check st_nlink before doing
this.  Remember that removing a file will only deallocate the data if
the name being removed is the LAST link to the file.

Before 4.2BSD rename(2) the only way to rename a file was to link(2) a
new name and then unlink(2) the old one.  I would hate to have files
zeroed every time I did a "mv".

					Jerry Aguirre @ Olivetti ATC
{hplabs|fortune|idi|ihnp4|tolerant|allegra|glacier|olhqma}!oliveb!jerry

bzs@bu-cs.UUCP (Barry Shein) (05/23/86)

Why would the following not work? It just takes file names and zeros
out the named files. One could remove them later, you could write a
trivial shell file to do both.

	-Barry Shein, Boston University
-----
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * zf file...
 *
 * scribble zeros onto a file or files
 *
 * This program is UNIX specific in that you have to find out
 * if the underlying system reallocates and copies updated blocks
 * on the disk or re-writes in place (that is, if you are zeroing
 * the file for security reasons.)
 *
 * Barry Shein, Boston University
 */

/*
 * If PORTABLE not defined uses some UNIX assumptions. On a non-UNIX system
 * you will probably have to provide the call to stat for size of the file
 * and, either remove the check for file type or provide that also.
 * That is, the program is not necessarily portable but I pointed out
 * what you would need to fix to port it to your system.
 */
/* #define PORTABLE */

char *prog;
char buf[BUFSIZ]; /* Assumes externs are zeroed, otherwise zero buf in main */
main(argc,argv) int argc; char **argv;
{
  FILE *fp;
  struct stat stbuf;

  prog = *argv;

/* uncomment and maybe provide bzero() if needed */
/*
  bzero(buf,sizeof(buf));
 */

  if(argc < 2) {
    fprintf(stderr,"Usage: %s file[s]\n",prog);
    exit(1);
  }
  ++argv;
  --argc;

  for(; argc > 0; ++argv, --argc) {
    /* get size and file type */
    if(stat(*argv,&stbuf) < 0) {
      perror(*argv);
      continue;
    }
    /* only zero certain file types, customize as needed */
    /* UNIX specific probably */
    /* check the type before opening so a rejected file is never left open */
    switch(stbuf.st_mode & S_IFMT) {

    case S_IFREG:
    case S_IFCHR:
    case S_IFBLK:
      break;

    default:
      fprintf(stderr,"%s: %s: must be regular or char or block special\n",
	      prog,*argv);
      continue;
    }
    /* end UNIX specific probably */

    /* open for update from beginning of file */
    if((fp = fopen(*argv,"r+")) == 0) {
      perror(*argv);
      continue;
    }

    zfile(fp,stbuf.st_size);
    fclose(fp);
  }
  exit(0);
}
/*
 * zfile - zero all bytes in a file
 */
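#ifdef PORTABLE
zfile(fp,size) FILE *fp; off_t size;
{
  int n;

  while(size > 0) {
    /* write at most one buffer of zeros per pass */
    n = BUFSIZ > size ? (int) size : BUFSIZ;
    if(fwrite(buf,sizeof(buf[0]),n,fp) != n) {
      perror(prog);   /* stop at the first failed write */
      return;
    }
    size -= n;
  }
}
#else
/*
 * This version of zfile will be faster on UNIX systems
 */
zfile(fp,size) FILE *fp; off_t size;
{
  int n;

  while(size > 0) {
    /* write at most one buffer of zeros per pass */
    n = BUFSIZ > size ? (int) size : BUFSIZ;
    if(write(fileno(fp),buf,n) != n) {
      perror(prog);   /* stop at the first failed write */
      return;
    }
    size -= n;
  }
}
#endif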

tp@ndm20 (05/24/86)

The problem of zeroing a file no longer in use is tricky in unix
because the user has no way to delete a file.  rm simply unlinks it,
i.e. removes it from a directory.  Others have mentioned that this
does not imply that there are no other links, and indeed in order to
rm a file, you do not need any permission to read or write it, since
an rm is a function applied to the containing directory and not the
file itself.  What is needed is code in the kernel to zero a file
after the last link is removed.  The kernel implicitly deletes a file
with no links.  This is the only time the zeroing could take place.
Of course systems not worried about security wouldn't want the
overhead.  Maybe this should be a configuration parameter.  Of course
it would be nice if it were filesystem dependent, then you wouldn't
have to have the overhead on "non-secure" file systems.