[comp.unix.questions] Hard links vs. Soft links

anagram@desire.wright.edu ((For Mongo)) (08/23/90)

What is the difference between a hard link and a soft link?  Besides the fact
that a hard link seems to make a copy of the file, while the soft link just
points the OS to the real file.  In broader terms, my question is this: I have
a Tektronix 4301 that has the commands ls, ll, lf, lg, and lx, all of which are
derivatives of ls.  They are all the same size, and they are all linked
together.  When I had a system error and all the links were destroyed, I
deleted them all, except ls, and re-linked them using soft links.  I saved
about a quarter of a meg of disk-space.  I have come across some other files
that are the same way, and am wondering how much space I can save, compared to
how much system performance I will lose.  Can anyone tell me how soft links vs.
hard links will affect system performance. 

Thanks,
Steve P Potter
Systems Manager
Mission Research Corp

swfc@ulysses.att.com (Shu-Wie F Chen) (08/23/90)

In article <1084.26d2a42b@desire.wright.edu>, anagram@desire.wright.edu
((For Mongo)) writes:
|>What is the difference between a hard link and a soft link?  Besides the fact
|>that a hard link seems to make a copy of the file, while the soft link just
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A hard link associates another file name to a file.  It does *not* make
a copy. 

|>points the OS to the real file.  In broader terms, my question is this: I have
|>a Tektronix 4301 that has the commands ls, ll, lf, lg, and lx, all of which are
|>derivatives of ls.  They are all the same size, and they are all linked
|>together.  When I had a system error and all the links were destroyed, I
|>deleted them all, except ls, and re-linked them using soft links.  I saved
|>about a quarter of a meg of disk-space.  I have come across some other files
|>that are the same way, and am wondering how much space I can save, compared to
|>how much system performance I will lose.  Can anyone tell me how soft links vs.
|>hard links will affect system performance. 

The -i option to ls tells you the inode number of the file associated with
each file name (note the distinction between file and file name).  You might
want to do an ls -i to see what is really going on.
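
A minimal sketch of that check, run in a scratch directory so nothing under
/bin is touched.  The file names (orig, hard, soft) are made up for
illustration, and mktemp -d is assumed to be available.  The -l flag is added
so that ls reports on the symbolic link itself rather than on its target.

```shell
#!/bin/sh
# Scratch directory; every file name below is hypothetical.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

echo "some data" > orig
ln orig hard        # hard link: a second name for the same inode
ln -s orig soft     # symbolic link: a separate small file holding the path "orig"

# With -i, the inode number is printed in front of each entry.
ls -li orig hard soft

# Extract the inode numbers so they can be compared.
ino_orig=$(ls -li orig | awk '{print $1}')
ino_hard=$(ls -li hard | awk '{print $1}')
ino_soft=$(ls -li soft | awk '{print $1}')

if [ "$ino_orig" = "$ino_hard" ]; then
    echo "orig and hard are two names for one file"
fi
if [ "$ino_orig" != "$ino_soft" ]; then
    echo "soft is a different file that merely points at orig"
fi
```

If the l* commands on the system in question show different inode numbers
under the same test, they are separate copies, not links.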

On a side note, you might want to alias ll, lf, lg, and lx to 'ls -xxx'
instead of keeping separate binaries.  For instance, I have ll aliased
to  ls -lasF.

|>
|>Thanks,
|>Steve P Potter
|>Systems Manager
|>Mission Research Corp
                
You're welcome.

*swfc

jeff@quark.WV.TEK.COM (Jeff Beadles) (08/24/90)

anagram@desire.wright.edu ((For Mongo)) writes:
>What is the difference between a hard link and a soft link?  Besides the fact
>that a hard link seems to make a copy of the file, while the soft link just
>points the OS to the real file.  In broader terms, my question is this: I have
>a Tektronix 4301 that has the commands ls, ll, lf, lg, and lx, all of which are
>derivatives of ls.  They are all the same size, and they are all linked
>together.  When I had a system error and all the links were destroyed, I
>deleted them all, except ls, and re-linked them using soft links.  I saved
>about a quarter of a meg of disk-space.  I have come across some other files
>that are the same way, and am wondering how much space I can save, compared to
>how much system performance I will lose.  Can anyone tell me how soft links vs.
>hard links will affect system performance. 
>
>Thanks,
>Steve P Potter
>Systems Manager
>Mission Research Corp


First, under UTek (a 4.2BSD based Unix), the l{f,g,l,r,s,x} commands are all
linked to the same binary.  The binary then looks at argv[0] to determine what
flags should be set by default.  Here's a 'll -i' of the files in question:

 8166 -r-xr-xr-x  6 sys         54272 Apr 12  1989 /bin/lf
 8166 -r-xr-xr-x  6 sys         54272 Apr 12  1989 /bin/lg
 8166 -r-xr-xr-x  6 sys         54272 Apr 12  1989 /bin/ll
 8166 -r-xr-xr-x  6 sys         54272 Apr 12  1989 /bin/lr
 8166 -r-xr-xr-x  6 sys         54272 Apr 12  1989 /bin/ls
 8166 -r-xr-xr-x  6 sys         54272 Apr 12  1989 /bin/lx

Note that they all have the same inode number.  Thus, if you did a 'df' in a
quiescent file system, then removed all but one of the 'l*' commands, and did
another 'df', the space used would not change at all.
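
The same point can be sketched without touching /bin.  This is only an
illustration under assumptions: the names bin_ls, bin_ll and bin_lf are
hypothetical data files standing in for the real commands, and mktemp -d is
assumed to exist.  It shows that removing extra names only lowers the link
count; the data stays in place as long as one name remains.

```shell
#!/bin/sh
# Scratch directory; bin_ls, bin_ll and bin_lf are hypothetical stand-ins.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# One 64 KB file, then two more names for the same inode.
dd if=/dev/zero of=bin_ls bs=1024 count=64 2>/dev/null
ln bin_ls bin_ll
ln bin_ls bin_lf

# The second field of "ls -l" is the link count.
links_before=$(ls -l bin_ls | awk '{print $2}')

# Removing two of the names removes directory entries, not data.
rm bin_ll bin_lf
links_after=$(ls -l bin_ls | awk '{print $2}')

# The file is still reachable, at full size, through the remaining name.
size_after=$(($(wc -c < bin_ls)))

echo "links before: $links_before, after: $links_after, bytes: $size_after"
```

The blocks are only released when the last name is removed and the link count
reaches zero, which is why 'df' would not move in the experiment described
above.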

Symbolic links are slower to follow.  This is because the kernel has to first
get the symbolic link, resolve it to find what it points to, and then resolve
the file that it points to.

By using a hard link, the kernel can immediately resolve it to inode '8166' and
do the right thing.

I just did a little test to see how much this affects things.  I stat(2)'ed a
file 50,000 times.  The first time was stat'ing a plain file, and the second
was a symbolic link that pointed to a file in the same directory.  It was the
same file, so everything should even be cached...

Here are the results:

type		Time(clock)
-----------	----------
plain file:	45.5 seconds
Symbolic link:	94   seconds

This is not a thorough test procedure, but the results are about what I
expected.  FYI, here's the test program that I used:

-------------------snip here------------
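#include <sys/types.h>
#include <sys/stat.h>

/* Call stat(2) on /tmp/stat 50,000 times so time(1) can measure the cost. */
int
main(void)
{
	register long count;
	struct stat statbuf;

	count = 0L;

	/* The result is deliberately ignored; only the lookup cost matters. */
	while ( count++ < 50000L)
		(void)stat("/tmp/stat", &statbuf);

	return 0;
}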
-------------------snip here------------

After compiling, I did the following:

% rm /tmp/stat
% touch /tmp/stat
% time stest 
0.1u 45.5s 0:45 99% 0+0k 0+1io 1pf+0w
% rm /tmp/stat
% touch /tmp/foo
% ln -s /tmp/foo /tmp/stat
% time stest
0.1u 93.8s 1:34 99% 0+0k 0+1io 1pf+0w

Overall, if you traverse the links often, then you will see a performance hit.
Symbolic links do have their advantages, though: unlike hard links, they can
span filesystems.

	-Jeff
-- 
Jeff Beadles				jeff@quark.WV.TEK.COM 
Utek Engineering, Tektronix Inc.
			SPEEA - Just say no.

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/24/90)

In article <1084.26d2a42b@desire.wright.edu> anagram@desire.wright.edu ((For Mongo)) writes:
>When I had a system error and all the links were destroyed, I
>deleted them all, except ls, and re-linked them using soft links.

Thereby using up more disk space and slowing down execution.

bob@wyse.wyse.com (Bob McGowen x4312 dept208) (08/24/90)

In article <13646@ulysses.att.com> swfc@ulysses.att.com (Shu-Wie F Chen) writes:
>In article <1084.26d2a42b@desire.wright.edu>, anagram@desire.wright.edu
>((For Mongo)) writes:
---deleted discussion of hard links, symbolic links
>
>On a side note, you might want to alias ll, lf, lg, and lx to 'ls -xxx'
>instead of keeping separate binaries.  For instance, I have ll aliased
		    ^^^^^^^^^^^^^^^^^
Using links of any sort, as you noted, will not create a copy, so no
separate binaries.

>to  ls -lasF.
>

As for using aliases for this function, only csh and ksh (if you have it)
would be able to do this.  By using links pointing to the one file and
letting the program determine the function based on the name used to
call it, sh users can also have this ability.  The particular system
in the original post has this ability.

On my system, ls only recognizes the lc alternate name, so I must use
either aliases or a shell script to get the function.

For sh users, the following has worked for me:


	:
	# emulate XENIX style listing commands
	# I used the following in case this happened to get run by
	# csh or ksh, which "remember" the command by its full
	# path name

	BASENAME=`basename "$0"`
	
	# "$@" passes each argument through intact, even if it contains spaces
	case $BASENAME in
	   l)			# long listing
	      ls -l "$@"
	   ;;
	   ll)			# long listing, BSD(?) style
	      ls -l "$@"
	   ;;
	   lf)			# columns with * and slash
	      ls -CF "$@"
	   ;;
	   lx)			# columns sorted in rows
	      ls -x "$@"
	   ;;
	   lr)			# columns, recursively
	      ls -CR "$@"
	   ;;
	   la)			# columns, all files
	      ls -Ca "$@"
	   ;;
	esac
Bob McGowan  (standard disclaimer, these are my own ...)
Product Support, Wyse Technology, San Jose, CA
..!uunet!wyse!bob
bob@wyse.com

bob@wyse.wyse.com (Bob McGowen x4312 dept208) (08/24/90)

In article <13646@ulysses.att.com> swfc@ulysses.att.com (Shu-Wie F Chen) writes:
>In article <1084.26d2a42b@desire.wright.edu>, anagram@desire.wright.edu


In my followup to this article, I forgot one critical thing to make the
script I included work.  Create the file under one of the names that appear
as selections in the case, then use ln to make hard links from it to each of
the other names.

	vi lf
	ln lf la
	ln lf lx
	ln lf lr
	ln lf l
	ln lf ll
	...

Sorry for the omission.
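
For anyone who wants to try the whole procedure in one go, here is a rough
sketch.  It is not the script from the earlier post: the dispatcher below is a
cut-down stand-in that only echoes the ls flags it would choose, so the result
is predictable.  The scratch directory and the use of mktemp -d are
assumptions made for the demonstration.

```shell
#!/bin/sh
# Scratch directory for the demonstration (hypothetical location).
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# Cut-down dispatcher: picks behaviour from the name it was called by.
cat > lf <<'EOF'
case `basename "$0"` in
   lf) echo "ls -CF" ;;
   ll) echo "ls -l" ;;
   lx) echo "ls -x" ;;
   *)  echo "unknown name" ;;
esac
EOF

# One file, three names.
ln lf ll
ln lf lx

# Run through sh so the sketch does not depend on execute permission;
# $0 is still the name given on the command line.
out_lf=$(sh ./lf)
out_ll=$(sh ./ll)
out_lx=$(sh ./lx)

echo "lf -> $out_lf, ll -> $out_ll, lx -> $out_lx"
```

In real use the file would be made executable and placed in a directory on
the PATH, so that each link can be run directly by name.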

Bob McGowan  (standard disclaimer, these are my own ...)
Product Support, Wyse Technology, San Jose, CA
..!uunet!wyse!bob
bob@wyse.com

roland@ai.mit.edu (Roland McGrath) (08/24/90)

In article <1084.26d2a42b@desire.wright.edu>, anagram@desire.wright.edu
((For Mongo)) writes:
|>What is the difference between a hard link and a soft link?  Besides the fact
|>that a hard link seems to make a copy of the file, while the soft link just
|>points the OS to the real file.

You have a serious misconception about this.  The way Unix files work is that
each file is described by an i-node, which contains modification times,
permissions, etc., and points to where the data is stored on disk.  Each i-node
on a disk has a unique number.  Thus each file is uniquely described by its
device number (which says what disk it's on) and its i-node number.

Directories are special files which contain only directory entries.  Directory
entries consist of a name and an i-node number.  Each directory entry which
refers to a given i-node is a hard link to the file that i-node describes.  If
there are multiple hard links to a single file, (which happens when you create
a hard link to an existing file with "ln foo bar"), each link is equivalent.
The first one made is not in any way special; they are all "the real file".

Symbolic (soft) links are a special type of file which contains a path name.
Under most circumstances, when the system encounters a symbolic link, it reads
the path name and uses that instead for whatever operation it was doing.  (The
exceptions are symlink, readlink and lstat, which deal specifically with
creating and inspecting symbolic links.)
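
A small sketch of what that means in practice.  The names target and link are
hypothetical, and the readlink and mktemp commands are assumed to be present
(they are on current Linux and BSD systems, but may be missing on older ones).
It shows that the link stores a path, that ordinary reads follow it, and that
the link survives, dangling, when the target is removed.

```shell
#!/bin/sh
# Scratch directory; "target" and "link" are made-up names.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

echo "hello" > target
ln -s target link

# The link's own content is just the path name it was given.
stored=$(readlink link)

# An ordinary read follows that path and reaches the target's data.
via_link=$(cat link)

# Remove the target: the link itself remains but now points nowhere.
rm target
if [ -L link ] && [ ! -e link ]; then
    dangling=yes
else
    dangling=no
fi

echo "stored path: $stored, read via link: $via_link, dangling: $dangling"
```

A hard link cannot dangle in this way, because the file's data is kept for as
long as any directory entry still refers to its i-node.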

|>In broader terms, my question is this: I have a Tektronix 4301 that has the
|>commands ls, ll, lf, lg, and lx, all of which are derivatives of ls.  They
|>are all the same size, and they are all linked together.  When I had a system
|>error and all the links were destroyed, I deleted them all, except ls, and
|>re-linked them using soft links.  I saved about a quarter of a meg of
|>disk-space.

Those were not hard links.  They were copies.  I'm not sure how your system
error removed the extra links and made new copies of the file under the same
names all by itself.  Sounds more like a human error to me.

|>I have come across some other files that are the same way, and am wondering
|>how much space I can save, compared to how much system performance I will
|>lose.  Can anyone tell me how soft links vs.  hard links will affect system
|>performance.

Hard links will save space and be more efficient.  They are just harder to deal
with, since, although you can find out how many hard links exist to a given
i-node (the link count is the first number in an "ls -l" listing), it is
nontrivial to find out where they all are.
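
One way to hunt them down, sketched under assumptions: it relies on a find
that supports the -inum test (GNU and BSD find do), and the directory layout
(a/file, b/alias) is invented for the example.  The search has to cover the
whole filesystem the file lives on, which is what makes it expensive on a real
system; here it is limited to a scratch directory.

```shell
#!/bin/sh
# Scratch tree; the directory and file names are hypothetical.
dir=$(mktemp -d) || exit 1
mkdir "$dir/a" "$dir/b" || exit 1

echo data > "$dir/a/file"
ln "$dir/a/file" "$dir/b/alias"     # second name in another directory

# Look up the i-node number of the file we care about.
ino=$(ls -i "$dir/a/file" | awk '{print $1}')

# -xdev keeps find on one filesystem, since i-node numbers are only
# unique within a filesystem.
found=$(find "$dir" -xdev -inum "$ino" -print | sort)
n_names=$(($(printf '%s\n' "$found" | wc -l)))

echo "$n_names names refer to i-node $ino:"
echo "$found"
```

On a live system the starting point would be the mount point of the
filesystem rather than a scratch directory, and the search would take
correspondingly longer.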

Hard links save space over symlinks because a symlink is another file (though a
small one; its contents are the path name of the file it refers to), including
a directory entry giving the i-node of the symlink, while a hard link is just
the directory entry.  Hard links are more efficient than symlinks because once
the system reads the directory entry for a hard link, it has the i-node number,
which tells it where on the disk to find the file, while after reading the
directory entry for a symlink, the system must then go find its contents on the
disk, and then do name resolution all over again to find the directory entry it
refers to, and then use the i-node number in that directory entry to find the
file on disk (unless, of course, it's a symlink to a symlink, in which case it
has to go through the whole process yet again).

If by "the same way" you mean identical copies, then it's probably a good idea
to remove the extra copies and replace them with links.  Using symlinks will
be easier to deal with, and the cost in space and efficiency is negligible.

If you mean that you have files to which there are multiple hard links, then
the only thing you will gain by replacing the hard links with symlinks is that
you will confuse yourself less.

If you want to know how many links there are to a given file, look at the first
number in an "ls -l" listing.  If you want to know if two files are the same
file, do "ls -i" on them and compare the i-node numbers.
--
	Roland McGrath
	Free Software Foundation, Inc.
roland@ai.mit.edu, uunet!ai.mit.edu!roland