I don't know of any way that an ordinary user could parlay the ability
to make hard links to a directory into obtaining superuser status.
But that is not the only reason why some system calls are restricted.
A foolish user could create loops in the directory structure.
Lots of file system functions depend on the absence of loops in
order to guarantee completion. Some system calls would never return.
William Lewis> So I suppose it's probably a security issue somehow
Denial of service is sometimes considered a security issue,
and sometimes considered just a matter of proper administration.
Choose your own taxonomy of admin nightmares.
Aaron
I quote from the original Ritchie and Thompson paper:
The directory structure is constrained to have the form of a
rooted tree. Except for the special entries ``.'' and ``..'',
each directory must appear as an entry in exactly one other
directory, which is its parent. The reason for this is to
simplify the writing of programs that visit subtrees of the
directory structure, and more important, to avoid the
separation of portions of the hierarchy. If arbitrary links to
directories were permitted, it would be quite difficult to
detect when the last connection from the root to a directory
was severed.
No need for excess paranoia...
> In the man entry for ln(1) (and for link(2)), it says that
>hard links may not be made to directories, unless the linker is
>the super-user (in order to make '.' and '..', I suppose). My
>question is: why not?
What should be the output of the last command of this sequence?
mkdir dir1
mkdir dir2
mkdir dir1/a_directory
ln dir1/a_directory dir2/directory_link
cd dir2/directory_link
cd ..
pwd
--
Jeff Mulligan (j...@eos.arc.nasa.gov)
NASA/Ames Research Ctr., Mail Stop 262-2, Moffet Field CA, 94035
(415) 604-3745
I don't think the reason has anything to do with security. My understanding
of the problem is that it would allow the creation of circularly linked
structures that the crude reference count scheme of UNIX could not deal
with. UNIX assumes that the only circular linkages in the directory tree
are . and .., and the special services "mkdir" and "rmdir" take care of
these special cases.
The Cambridge CAP file system demonstrated quite effectively that it is
perfectly possible to allow users to create arbitrary linkages between
directories and files, but this worked only because they invented a nice
combination of garbage collection and reference counts to handle the
problem of reclaiming circularly linked garbage.
The CAP file system was capability based, but the decision to allow
directories to be circularly linked is independent of the access control
mechanisms they used on their capabilities. In UNIX terms, the CAP
file system can be thought of as having an access rights field like that
in each UNIX I-node, but this was stored in the link to the file, so each
link could confer different access rights to the file.
Doug Jones
jo...@herky.cs.uiowa.edu
According to Ritchie and Thompson [1]:
The directory structure is constrained to have the form of a
rooted tree. Except for the special entries "." and "..",
each directory must appear as an entry in exactly one other
directory, which is its parent. The reason for this is to
simplify the writing of programs that visit subtrees of the
directory structure, and more important, to avoid the
separation of portions of the hierarchy. If arbitrary links
to directories were permitted, it would be quite difficult
to detect when the last connection from the root to a
directory was severed.
And according to Thompson [2]:
The file system structure allows an arbitrary, directed graph
of directories with regular files linked in at arbitrary places
in this graph. In fact, very early UNIX systems used such a
structure. Administration of such a structure became so
chaotic that later systems were restricted to a directory tree.
Both references are from the _Bell System Technical Journal_,
Vol. 57, No. 6, Part 2 (July-August 1978), special issue:
"UNIX Time-Sharing System."
[1] D. M. Ritchie and K. Thompson, "The UNIX Time-Sharing System."
Page 1909.
[2] K. Thompson, "UNIX Implementation." Page 1942.
-- Speaking strictly for myself,
-- Lee Derbenwick, AT&T Bell Laboratories, Warren, NJ
-- l...@cbnewsm.ATT.COM or <wherever>!att!cbnewsm!lfd
Why, dir1 and dir2 of course! Considering BOTH are parent directories,
they should be allowed to have the same kid - right? Any ideas?
Wouldn't this be a NEAT addition to UNIX?
-Just an Idea,
Kartik
subbarao@{phoenix,bogey or gauguin}.Princeton.EDU -|Internet
kar...@silvertone.Princeton.EDU (NeXT mail) -|
subb...@pucc.Princeton.EDU - Bitnet
The big (and I mean REAL BBBBIIIIGGGG) problem with hard linking directories
is that find does not know how to recognize and handle them. When find
processes a file system it actually cd's into each directory and then cd's to ..
to go back. When you have two directories linked together, a cd to .. in
either directory will always go to the same parent directory. If both
are at the exact same place in the file system you would be OK, but if
they are at different levels (different paths, other than the basename),
find will end up skipping some of your file system.
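The traversal style described can be sketched in C (a minimal sketch, assuming a POSIX dirent API; the function name `walk' is mine, not from any find source): it descends with chdir() and returns with chdir(".."), which is exactly the step that cannot be trusted across a hard-linked directory.

```c
#include <dirent.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count directories the way this style of find visits them: descend
 * with chdir(name), come back with chdir("..").  With a hard-linked
 * directory, ".." names the one on-disk parent, which need not be the
 * directory we descended from -- and the walk goes astray. */
static int walk(const char *name)
{
    DIR *d;
    struct dirent *e;
    struct stat sb;
    int n = 1;                       /* count this directory itself */

    if (chdir(name) == -1)
        return 0;
    if ((d = opendir(".")) != NULL) {
        while ((e = readdir(d)) != NULL) {
            if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
                continue;
            if (lstat(e->d_name, &sb) == 0 && S_ISDIR(sb.st_mode))
                n += walk(e->d_name);
        }
        closedir(d);
    }
    chdir("..");                     /* the step that cannot be trusted */
    return n;
}
```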
Now you might say that you don't care that much about find. That is, you
might say this until you realize that find is used as a main portion of the
backup scheme on many systems, so your backups will get screwed up.
Anyway, that is one problem. There probably are others with equally
disastrous results.
>that's the user's privilege. So I suppose it's probably a security
>issue somehow (restrictions of this sort seem to be). Hence the
>crosspost to alt.security.
>--
>wi...@blake.acs.washington.edu Seattle, Washington | No sig under
>(William Lewis) | 47 41' 15" N 122 42' 58" W |||||||| construction
--
Conor P. Cahill (703)430-9247 Virtual Technologies, Inc.,
uunet!virtech!cpcahil 46030 Manekin Plaza, Suite 160
Sterling, VA 22170
And where would you have .. point if you had multiple links to
a directory? Things would get pretty weird.
-Ron
This sounds more mysterious than it is. When a_directory is created,
its parent is dir1, and it stays there. When directory_link is
created, all that happens is that the directory dir2 gets a new entry
that names the same inode as a_directory. The a_directory inode is NOT
changed (except that its reference count increases), so its parent is
still dir1. So,
pwd == dir1
in any unix I know of.
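The claim is easy to check with stat(2); here is a minimal sketch (the helper name `dotdot_is' is mine, not from any post): it compares the inode named by a directory's on-disk `..' entry against a supposed parent.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Return 1 if dir's on-disk ".." entry really names parent, 0 if not,
 * -1 if either path cannot be stat'ed.  Because ".." is an ordinary
 * directory entry, the answer does not depend on which path (which
 * link) was used to reach dir. */
int dotdot_is(const char *dir, const char *parent)
{
    char buf[4096];
    struct stat up, par;

    snprintf(buf, sizeof buf, "%s/..", dir);
    if (stat(buf, &up) == -1 || stat(parent, &par) == -1)
        return -1;
    return up.st_dev == par.st_dev && up.st_ino == par.st_ino;
}
```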
--
== Mark Warren Bull HN Information Systems Inc. ==
== (508) 294-3171 (FAX 671-3020) 300 Concord Road MS820A ==
== mwa...@granite.cr.bull.com Billerica, MA 01821 ==
a_directory contains entries '.' and '..' that point to itself
and its parent, dir1, respectively.
|ln dir1/a_directory dir2/directory_link
|cd dir2/directory_link
|cd ..
|pwd
If your shell is csh, it could perhaps tell you "dir2". That's because
csh has 'pwd' builtin and keeps track of the current directory itself
in $cwd; it is also confused by symbolic links *). But a proper pwd
(try /bin/pwd) should always give you "dir1" (as absolute path, of
course), since that is where '..' points at.
Leo.
*) This behaviour of C shell is corrected in more recent versions.
Outside of the fact that "dump" *should* be the way to go to back up
systems (religious issue -- flames to Email!), the problem with hard-
linking directories is indeed a security issue at one point.
Consider the user who knows that he is chroot(2)ed somewhere. If he could,
via another account, make a hard link to somewhere upward of his chroot(2)ed
point (assuming that his new root is not the root of a separate file system)
then he could access things he wasn't meant to.
Another claim is somewhere in the rename(2) man page:
"CAVEAT
The system can deadlock if a loop in the file system graph
is present. This loop takes the form of an entry in direc-
tory "a", say "a/foo", being a hard link to directory "b",
and an entry in directory "b", say "b/bar", being a hard
link to directory "a". When such a loop exists and two
separate processes attempt to perform "rename a/foo b/bar"
and "rename b/bar a/foo", respectively, the system may dead-
lock attempting to lock both directories for modification.
Hard links to directories should be replaced by symbolic
links by the system administrator."
>>that's the user's privilege. So I suppose it's probably a security
>>issue somehow (restrictions of this sort seem to be). Hence the
>>crosspost to alt.security.
Well, it IS the user's privilege to make up a convoluted directory
structure in his own namespace, but using symbolic links. They're much
easier to resolve, since you don't have to do an ncheck to find out
which directories have such-and-such an inode.
Now, WHY a user would need to make a namespace convoluted escapes me, but
the world is full of oddities, now, ain't it?
>>--
>>wi...@blake.acs.washington.edu Seattle, Washington | No sig under
>>(William Lewis) | 47 41' 15" N 122 42' 58" W |||||||| construction
>
>
>--
>Conor P. Cahill (703)430-9247 Virtual Technologies, Inc.,
>uunet!virtech!cpcahil 46030 Manekin Plaza, Suite 160
> Sterling, VA 22170
--
-- once bitten, twice shy, thrice stupid --
MORALITY IS THE BIGGEST DETRIMENT TO OPEN COMMUNICATION.
/earth: minimum percentage of free space changes from 10% to 0%
should optimize for space^H^H^H^H^Hintelligence with minfree < 10%
This problem does not seem so crucial for symbolic links. This leads to
the question what makes the problem crucial in one case and not in the
other? Another question is when is it more appropriate to use one or the
other?
--
Franck BOISSIERE bois...@irisa.irisa.fr
Prototyping Lab Manager bois...@ccettix.UUCP
C.C.E.T.T. B.P. 59 boissier%irisa.i...@uunet.uu.net
35512 CESSON SEVIGNE CEDEX FRANCE
> [...] The reason for this is to
> simplify the writing of programs that visit subtrees of the
> directory structure, and more important, to avoid the
> separation of portions of the hierarchy. [...]
Exactly!
For those of us who have no symlinks, hard-linking to a directory
(if possible) can sometimes avoid lots of headaches. If you don't
create circular links, the only "surprise" may be that ".." sometimes
is not what you normally expect. Minor problems may occur with find and
some backup programs, which may duplicate parts of the disk on the backup
media and hence use more space than expected or available.
BTW: A few years back I wrote a "directory compression program" which
only linked around the files to get rid of empty slots in directories
that were once large but later shrank. (This *can* be
done with standard commands, but not in the most efficient way.)
If the directory to be compressed contained sub-directories,
things became a bit complicated, because the ".." entry of the
sub-directories had to be re-linked ... all in all it's a nice
exercise for students who really want to understand how directory
hierarchies are implemented under unix :-)
--
Martin Weitzel, email: mar...@mwtech.UUCP, voice: 49-(0)6151-6 56 83
JBM> What should be the output of the last command of this sequence?
JBM> mkdir dir1
JBM> mkdir dir2
JBM> mkdir dir1/a_directory
JBM> ln dir1/a_directory dir2/directory_link
JBM> cd dir2/directory_link
JBM> cd ..
JBM> pwd
dir1:
- dir2/directory_link/.. is a link to dir1 that existed before
dir1/a_directory was linked to dir2/directory_link
The difference is that with symlinks, the real (hard) link is given a higher status
than the symbolic link. So, for example, `find' can ignore symlinks and
follow hard links. With two hard links, there's no local strategy (i.e.
a strategy whose action on small portions of the filesystem is determined
only by characteristics of that small portion) which follows only one,
except for otherwise distinguishable cases like `.' and `..'.
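The one local test that does exist is for symlinks: lstat(2) reports the link itself rather than its target, so a traversal can decide locally whether to descend. A minimal sketch (the function name is mine):

```c
#include <sys/stat.h>

/* Decide locally whether a traversal should descend into `name'.
 * lstat() reveals a symlink before it is followed (S_IFLNK, not
 * S_IFDIR); no analogous local test tells a second hard link to a
 * directory from the first. */
int should_descend(const char *name)
{
    struct stat sb;

    if (lstat(name, &sb) == -1)
        return 0;
    return S_ISDIR(sb.st_mode);
}
```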
>Another question is when is it more appropriate to use one or the other?
Well, you have to use symlinks for directories. Often links are used to
rearrange filesystems, like on workstations running sunos prior to v4 where
/usr is a shared filesystem but /usr/spool is private, so /usr/spool is a
symlink to /private/usr/spool; even if this were a plain file it couldn't be a
hard link because it crosses filesystems. So, quite frequently you don't have
a choice.
When you do have a choice, I recommend symlinks for anything under maintenance,
because it's too easy to have hard links broken by moving files around. It's
also very convenient that tar preserves symlinks.
I tentatively think it's good that things like the link from /bin/e to
/bin/ed are hard links, because it saves one block of disk space on tens of
thousands of machines, therefore saving tens of thousands of blocks. On the
other hand, I've heard that Dennis Ritchie recommended that hard links (other
than . and ..) be removed when symlinks were added, so that there weren't two
fundamental ways to do something, and that sounds reasonable to me.
ajr
Imagine the fun a user could have with the following:
% ln . foo
% ln .. bar
It would annoy a lot of the utilities you might like to run, like
du, ls -R, etc.
>It seems perfectly harmless to me, although
>it would allow the user to make a pretty convoluted directory structure,
>that's the user's privilege. So I suppose it's probably a security
>issue somehow (restrictions of this sort seem to be). Hence the
>crosspost to alt.security.
Well, perhaps the following could be hazardous:
# rm -r bar
Just a thought.
--
Paul Shields shi...@nccn.yorku.ca
P.S: on VAX/VMS 3.7 the above (with a different command set of course)
is possible. I don't know about old versions of UNIX.
The commands that existed in VMS 3.7 still existed in VMS 5.3. I
don't know if they will still let you create cycles in the directory
structure. I do know they were used to share files in a VAXcluster
because VMS didn't have symbolic links....
This hard linking on VMS has caused lots of trouble since it normally
isn't done. BACKUP assumes that all files have one link (basically)
so restoring a disk that was backed up that had odd directory entries
like this caused the files to be duplicated, rather than re-linked.
This was only a problem for one type of backup (used to do
incrementals), not the full image (level 0) backups.
Don't know if they fixed it, but it sounds like a "denial of service"
security hole when the original disks are tight on space. In
addition, the way that it was implemented caused a hole whereby
certain programs could read files that a user couldn't normally read.
Plan files with finger springs to mind......
Warner
--
Warner Losh i...@Solbourne.COM
Boycott Lotus. #include <std/disclaimer>
Sure, tell find not to follow a directory if the inode of "foo/.." is not
the inode of ".". (i.e., treat it as a symbolic link)
--
Peter da Silva. `-_-'
+1 713 274 5180. 'U`
<pe...@ficc.ferranti.com>
Now, why could we not have .. point to a table which listed all the directories
above it?
grey...@unisoft.UUCP (The Grey Wolf) writes:
[.. good advice about chroot security and rename deadlock.. ]
>Well, it IS the user's privilege to make up a convoluted directory
>structure in his own namespace, but using symbolic links. They're much
>easier to resolve, since you don't have to do an ncheck to find out
>which directories have such-and-such an inode.
>Now, WHY a user would need to make a namespace convoluted escapes me, but
>the world is full of oddities, now, ain't it?
Convolutions such as cycles aside,
the reason would be to store things non-hierarchically.
Many sites place source in /usr/local/src, but some use
/usr/src/local. Or put man pages in /usr/local/man vs. /usr/man/local.
as also /usr/bin/local and /usr/local/bin. And why is /usr in front
of all of these? It makes no logical sense.
Hierarchical organization for this is more tradition than real. And
it gets more convoluted with /usr/local/lib/tex/bin, ad nauseum...
people begin to have difficulty finding their way around. I provide
links to reduce the impact of guessing.
But the only use I ever found for cyclic directory structure was to
annoy system administrators.
Paul Shields
shi...@nccn.yorku.ca
On sane Unix systems, ln fails if the target file exists already. On
AT&T System V UNIX(R) Operating Systems, it silently goes ahead. Some
faceless imbecile in the hordes of System V UNIX(R) Operating System
developers thought it would be cute if ln, mv, and cp all worked the
same way.
--
NFS: all the nice semantics of MSDOS, | Henry Spencer at U of Toronto Zoology
and its performance and security too. | he...@zoo.toronto.edu utzoo!henry
as everyone has said, you don't want hard links to dirs
because the tree'ness of the file system gets buggered.
on the other hand, links to files already do that to some extent.
and symbolic links do it completely as you can symlink to directories.
allowing hard links to dirs makes the problem no worse, really.
and in fact, it may help because maybe then all the tree traversal code
everyone implements on their own will get buggered enough that everyone
will use the one routine (say ftw) which can do it right (as best it can, anyway).
the hard program to write and define semantics for has always been pwd.
some unixes remember how you got to a directory and use that knowledge
to interpret .. while most look up the file system. it all depends on what you want.
Well, now I get to ask the next question ...
My [ second ] favorite question is why doesn't the SunOS ln command
permit the use of the -f flag for blasting an existing target file?
Before answering that question, remember that USG's stupid behavior
existed before Sun's and that in the business world, one stupid
decision deserves another ;-)
--
John F. Haugh II UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832 Domain: j...@rpp386.cactus.org
on the other hand, links to files already do that to some extent.
and symbolic links do it completely as you can symlink to directories.
allowing hard links to dirs makes the problem no worse, really.
Actually, it makes it much worse. Symbolic links can be quite easily
detected by any program that is concerned about circular filesystem
structures, since a symbolic link has a different file type from the
file to which it points.
However, there is no way for a program to easily detect, without
keeping lots of internal state showing which inodes have already been
visited, when it is reading a hard link. Hard links are no different
from normal files.
For example, "find" has no trouble at all dealing with symbolic links,
but it can quite easily get into a looping state if it encounters a
hard-linked directory pointing higher up in the directory structure.
Jonathan Kamens USnail:
MIT Project Athena 11 Ashford Terrace
j...@Athena.MIT.EDU Allston, MA 02134
Office: 617-253-8495 Home: 617-782-0710
Because the (4.3)BSD "ln" command doesn't seem to, either:
auspex% cp /home/unix_src/bsd4.3/bin/ln.c .
auspex% cc -o ln ln.c
auspex% echo >foo
auspex% echo >bar
auspex% ./ln -f foo bar
bar: File exists
and because the command sequence
rm -f bar && ln -f foo bar
for example, would have done the job quite nicely....
>as everyone has said, you don't want hard links to dirs
>because the tree'ness of the file system gets buggered.
>on the other hand, links to files already do that to some extent.
Links to files don't `bugger' the file system's structure,
because they don't create loops that make tree traversal
ambiguous. It's still a tree, with some leaves fused
together.
>and symbolic links do it completely as you can symlink to directories.
>allowing hard links to dirs makes the problem no worse, really.
fsck can ignore symbolic links or treat them as ordinary files.
It can't ignore hard links. Hard links confuse the system;
symbolic links needn't.
--
Chuck Karish kar...@mindcraft.com
Mindcraft, Inc. (415) 323-9000
You're missing the entire purpose of the question - the question isn't
what is the behavior, but rather, why isn't the behavior something else.
That two different behaviors exist is obvious - now why hasn't anyone
bothered to add an option to select which one you get?
Another interesting question is why doesn't this work -
Script is typescript, started Sun Jul 22 23:16:41 1990
#rpp386-> pwd
/dev
#rpp386-> ls -l null
crw-rw-rw- 1 sysinfo sysinfo 4, 2 Jul 22 23:12 null
#rpp386-> mknod barf c 4 2
#rpp386-> mv barf /usr/tmp
#rpp386-> ls -l /usr/tmp/barf
-rw-r--r-- 1 root root 0 Jul 22 23:16 /usr/tmp/barf
#rpp386-> rm /usr/tmp/barf
#rpp386-> exit
Script done Sun Jul 22 23:17:04 1990
My manual says the MV command renames files. What was so hard about
renaming /dev/barf to /usr/tmp/barf? And before someone protests that
moving devices around is unusual, it also doesn't work for named pipes.
In fact, the behavior for renaming a named pipe is so far off it's quite
disgusting.
Older versions of mv couldn't move files between partitions at all,
since rename() doesn't work between partitions. Newer versions first
attempt to use rename() to change the name of a file, and if that
fails because the rename() attempted to rename a file to a different
partition, then it uses the same method as cp to copy the file.
Therefore, if /usr/tmp and /dev are on different partitions on your
system, then the rename() is going to fail, so mv is going to create a
file of the specified name in /usr/tmp, then open /dev/null for read
and copy its "contents" into the file in /usr/tmp.
The "correct" solution, perhaps, is to make mv check for special
files after the rename() fails, and to use the mknod code to recreate
special files. The problem with this is that it would require mv to
be setuid. Perhaps a better solution would be for mv to notice
special files and print out an error like "cannot rename special files
across partitions"; that is, after all, what it used to print when you
tried to mv regular files across partitions (without the word
"special").
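The check suggested here is a one-liner with lstat(2); a minimal sketch, assuming POSIX file-type macros (the function name is mine):

```c
#include <sys/stat.h>

/* After rename(2) fails across partitions, mv could test the source
 * like this and refuse to "copy" a device node or named pipe, whose
 * meaning is not data that cp-style copying can preserve. */
int is_special(const char *src)
{
    struct stat sb;

    if (lstat(src, &sb) == -1)
        return -1;
    return S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode) || S_ISFIFO(sb.st_mode);
}
```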
As a matter of fact, there is. Normal files are just hard links, like
you said, but a difference between a 'real directory' and an 'alias
formed by a hard link to a directory' is that the .. of the former is
the natural parent (the directory you get by stripping off the last
path component), while the .. of the latter is not. When traversing trees
it just becomes a matter of following only the natural links (just as
symbolic links are skipped, the artificial links are skipped).
|
|For example, "find" has no trouble at all dealing with symbolic links,
|but it can quite easily get into a looping state if it encounters a
|hard-linked directory pointing higher up in the directory structure.
No need for that if find only - recursively - follows those
subdirectories 'sub' for which the inode of 'sub/..' is the same as
that of '.' (I wouldn't be surprised if there was already something of
the same tenor in the source of find).
Leo.
Why can't potential loops be detected when a hard link request is given?
I.e., if I request to link directory `a' to directory `b', the
kernel could look up the tree to see if `a' already occurs as a parent
to `b', and deny the request if it does (ELOOPDEELOOP).
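Such a check can indeed be written in user space (a sketch of the idea, not how any kernel does it; the names are mine): walk up from `b' via `..', comparing device and inode numbers against `a', until the walk stops moving at the root.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Would linking directory `a' under `b' create a loop?  Returns 1 if
 * `a' is `b' or an ancestor of `b', 0 if not, -1 on error. */
int would_loop(const char *a, const char *b)
{
    struct stat as, cur, up;
    char path[8192];

    if (stat(a, &as) == -1 || stat(b, &cur) == -1)
        return -1;
    snprintf(path, sizeof path, "%s", b);
    for (;;) {
        if (cur.st_dev == as.st_dev && cur.st_ino == as.st_ino)
            return 1;                /* a is b or an ancestor of b */
        if (strlen(path) + 4 >= sizeof path)
            return -1;
        strcat(path, "/..");
        if (stat(path, &up) == -1)
            return -1;
        if (up.st_dev == cur.st_dev && up.st_ino == cur.st_ino)
            return 0;                /* ".." == "."; reached the root */
        cur = up;
    }
}
```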
--
This will only hold true as long as ".." always points to the parent
directory of the first reference created. This is true in most
implementations of Unix, but it does not *have* to be true, and in
addition, the POSIX standard doesn't require it. Indeed, POSIX doesn't
even require that directory entries "." and ".." actually exist in the
filesystem; all it requires is that when a program accesses "." and
"..", it will get the current and parent directories respectively. If
the operating system chooses to interpret "parent directory" as "the one
parent directory out of the several possible ones which was taken in
this particular case to get here," it is allowed to do that. Indeed, I
would find that interpretation more useful than "the directory in which
the child directory was first created, no matter which parent you went
through to get to it this time."
Relying on the way most filesystems work now is not a good idea,
especially when the standard says they don't have to work that way.
|> No need for that if find only - recursively - follows those
|> subdirectories 'sub' for which the inode of 'sub/..' is the same as
|> that of '.' (I wouldn't be surprised if there was already something
of
|> the same tenor in the source of find).
I'm fairly certain that no such special code exists in find. The
string ".." appears only once in the source code for find, and that's
when it's doing a chdir(".."), so I don't think there's any code to
check the inode numbers of ...
I'd test it, but whenever I try to create a hard link to a directory
as root, my workstation hangs or crashes. I wonder why :-)
Unfortunately, this does not quite work as desired:
% mkdir a b; ln b a/b
(now a/b is the same as b, and a/b/.. is the same as ./..)
% unlink b # (rmdir would complain)
(now b is gone; only a/b names the inode that has a/b/.. the same as ./..)
% touch a/b/somefile
% find . -print
Since a/b/.. is not the `natural parent' of a/b, find prints `a a/b'. The
file `somefile' has been `lost'. You can still get to it---but if find
were changed as proposed, it would no longer see it.
The trick here is that symlinks are truly special. In the symlink case:
% mkdir a b; ln -s ../b a/b
(must use `../b' since symlinks are interpreted relative to their place
in the tree)
% rmdir b
% touch a/b/somefile
touch: a/b/somefile: No such file or directory
%
What is going on: a hard link always refers to some particular inode
(the inode is the `real' existence of a file; a directory entry or path
name is simply a way of naming the inode), and that inode hangs around
until there are no references to it. A symlink, on the other hand,
simply contains a string that, when the symlink is used as a part of
a path name, is spliced in `in place' of the symlink itself. (Certain
operations, e.g., lstat, do not splice in the symlink when it is the last
component of a pathname.) Since the symlink does not hold a reference
to the *real* file (directory), when the name `b' is removed the reference
count goes to zero and the file actually goes away. The symlink is then
`orphaned', whereas the hard link `a/b' was simply left around with an
`invalid' `..'. If a new name `b' is created the symlink then resolves
to the new b, whereas the hard link a/b would not.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: ch...@cs.umd.edu Path: uunet!mimsy!chris
And you cut off the portion of my response in which I said *why* nobody
added that option; the answer is "you can get that behavior without
adding a flag to 'ln' - either remove the target first using a command
that doesn't barf if it's not there, or check if it's there first and
remove it if it is."
... thus defeating the purpose of find, since the user doesn't get
what he expected to get (namely, the entire directory tree descending
from his specified target). At least with symlinks, the user has an
obvious and easy way to determine whether or not a directory entry
will result in find hitting a dead-end: if "ls -l" shows that the
entry represents a symlink, then find won't traverse it. But what
does your scheme really buy that symlinks don't? It seems to me
that your non-primary hard links are just local symlinks in disguise,
except that it would now be even less obvious to the user that such
links are liable to surprise him.
------------------------------------------------------------------------
Bob Goudreau +1 919 248 6231
Data General Corporation
62 Alexander Drive goud...@dg-rtp.dg.com
Research Triangle Park, NC 27709 ...!mcnc!rti!xyzzy!goudreau
USA
Sure it can:
struct stat sbuf, dbuf;

if(stat("subdir/..", &sbuf) == -1)
	type = UNKNOWN;
else if(stat(".", &dbuf) == -1)
	PANIC("can't stat .");
else if(sbuf.st_ino != dbuf.st_ino || sbuf.st_dev != dbuf.st_dev)
	type = LINK;	/* "subdir" is not really a child of here */
else
	type = DIR;
I guess it should say '.' where it says './..'; if a/b is the same as
b, a/b/.. is the same as b/.., which is '.' (unless you want '..' to
mean the ancestor - strip off the last path component -, in which case
a/b/.. is a).
|
|Since a/b/.. is not the `natural parent' of a/b, find prints `a a/b'. The
|file `somefile' has been `lost'. You can still get to it---but if find
|were changed as proposed, it would no longer see it.
Well, I would say, if you play nasty tricks like that, you get what you
deserve (just like you get what you deserve when you create circular
hard links). Find, as it is, will also not find all files in a looping
directory structure, simply because it will start repeating itself at
some point.
When the check I proposed is added as an option to find, there is at
least a way to prevent looping if you want to.
Leo.
But that WILL be exactly what he gets! Artificial links are rejected
because they can let you traverse a structure that is anything but a
tree (the simple case is indeed an artificial link pointing to
another tree, but more complicated ones could point to a tree that
shares some part with the current tree, or - worse - contains the
current tree). I cannot imagine the user wants looping, or trees
traversed twice.
| At least with symlinks, the user has an
|obvious and easy way to determine whether or not a directory entry
|will result in find hitting a dead-end: if "ls -l" shows that the
|entry represents a symlink, then find won't traverse it. But what
|does your scheme really buy that symlinks don't?
You make it sound as if I would advocate the use of hard links to
directories: well, not in the least. What I proposed was just a way to
determine whether a directory link is a true directory (in the sense
that it is consistent with '.' and '..') which optionally can be used
by tree traversal algorithms, like in find. By all means, use symlinks,
that's what I do too (not being superuser and such).
If a user has to "ls -l" to check whether a directory entry is a
symlink and as such would result in find hitting a dead-end, this
strikes me as a non-obvious way, since to be consistent he has to do it
for the complete directory structure.
Leo.
They can be implemented on a system that doesn't currently support symlinks
without modifying the kernel.
--
// Conrad Longmore / Uucp: ..!ukc!axion!tharr!conrad / S*E*R*F says: Unix? //
// Bedford College / Janet: tharr!conrad @ uk.ac.ukc / VMS? VM/XE? Bleurgh!//
// Computer Centre / YelNet: +44 234 45151 x5350 / Long live Multics!! //
I've used versions of "mv" that existed before "rename()" existed,
and they did what you describe the "newer versions" as doing; they were
standard "mv"s from standard distribution tapes. I seem to remember
*all* versions of "mv" doing the latter, at least as far back as V7....
Precisely -- that's why on most systems that support symbolic links,
find doesn't traverse symlinks (at least, not by default). My point
was that "artificial" hard links are just a poor imitation of symbolic
links. You can't span file system boundaries with them, and they aren't
as easy to detect as real symlinks (see below). If you prohibit
non-superusers from unlinking them, then you're stuck with the unlovely
situation of users being unable to delete links they've created, even
though they have write access in the affected directory. If you lift
this prohibition, then you can easily end up with directories whose
".." entry points nowhere (or worse yet, points to what used to be
the parent inode, but has now been recycled into something completely
different).
> If a user has to "ls -l" to check whether a directory entry is a
> symlink and as such would result in find hitting a dead-end, this
> strikes me as a non-obvious way, since being consequent he has to do it
> for the complete directory structure.
But your "artificial" hard-links suffer from the same problem, only
more so! In order to distinguish the wolves (artificial hard links)
from the sheep (real hard links), it's necessary to compare the
inode number of "." and "wolf/..". This is far more of a pain in the
ass than just figuring out if "wolf" is a symlink or not.
Don't be too sure that even the superuser can link() to or unlink()
from directories. POSIX.1 doesn't require an implementation to
support link() and unlink() on directories, and implementations that
disallow such actions do indeed exist.
We're only talking about letting users make hard links to directories. Not
about deleting them. Root can always make files by playing games that find
can't find... the question at hand here is if a user can. A user can't
unlink a directory with unlink, so they can't build this structure even if
they can make a hard link to a directory.
So, what you are saying is that all I need is the test, cat, rm, and ln
commands to implement the cp and mv commands?
I'd rather just have the -f option to ln ...
Sorry I didn't cover this in my original posting, but I'm not as old
as you are :-).
Yes, before the rename system call existed, all mv's did basically
the same thing as cp (but they deleted the original after successful
writing of the copy). However, when rename was created, many vendors
(although not all, I'm sure) changed mv to use rename exclusively and
to fail if the rename failed. Many of those same vendors have more
recently once again modified mv to do what the original, pre-rename mv
did if the rename system call fails because of an attempt at
cross-device renaming.
Versions of mv that know how to copy have become more and more
important (and therefore more and more common) as more filesystem
types (NFS, AFS, RFS, RVD, and what-have-you) have been developed,
and as hardware has become more powerful. After all, the more
filesystem types you have, and the more powerful your hardware, the
more likely it is that any particular "mv" you do is going to be
across two different devices.
Before the rename() system call existed, "mv" attempted to link() the
old name to the new name and then to unlink() the old name. If the
link() failed (e.g., because it crossed a mount point) then it would
copy the file instead.
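That pre-rename() logic can be sketched in a few lines (a hedged reconstruction, not any vendor's actual source; Python's os.link/os.unlink are thin wrappers over the system calls):

```python
import errno
import os
import shutil

def pre_rename_mv(src, dst):
    # What "mv" did before rename(2) existed: link the old name to the
    # new one, then unlink the old name.  If link() fails because the
    # two names are on different filesystems (EXDEV), fall back to
    # copying the file and deleting the original.
    try:
        if os.path.exists(dst):
            os.unlink(dst)          # old mv clobbered an existing target
        os.link(src, dst)
    except OSError as e:
        if e.errno != errno.EXDEV:
            raise
        shutil.copy2(src, dst)      # the cross-device fallback
    os.unlink(src)
```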
I'm not well acquainted with the introduction of rename() into kernels
derived from System V. When Berkeley implemented rename() they
substituted it for link()/unlink() in "mv" and left the copy code
intact. If a company shipping a BSD-derived "mv" removed the code
that copies files when rename() fails, then they deserve all of the
criticisms that can be sent their way.
--
John Bruner Center for Supercomputing R&D, University of Illinois
bru...@csrd.uiuc.edu (217) 244-4476
> the answer is "you can get that behavior without adding a flag to 'ln'
> - either remove the target first using a command that doesn't barf if
> it's not there, or check if it's there first and remove it if it is."
But that doesn't give the same result. If you remove file2 first, then
there is a brief window between the 'rm' and the 'ln' during which no
file named file2 exists. If you use a version of 'ln' that clobbers
file2 and replaces it with a link to file1 in an atomic operation, there
is no such window. The difference might be important in some
applications.
--apb
Alan Barrett, Dept. of Electronic Eng., Univ. of Natal, Durban, South Africa
Internet: Barret...@f4.n494.z5.fidonet.org UniNet-ZA: Barrett@UNDEE
Real Soon Now: Bar...@EE.UND.AC.ZA PSI-Mail: PSI%(6550)13601353::BARRETT
> But that doesn't give the same result. If you remove file2 first, then
> there is a brief window between the 'rm' and the 'ln' during which no
> file named file2 exists. If you use a version of 'ln' that clobbers
> file2 and replaces it with a link to file1 in an atomic operation, there
> is no such window. The difference might be important in some
> applications.
Wrong. There is no way to clobber file2 and replace it with a hard
link in an atomic operation. link(2) requires that the target file
not exist, otherwise it fails with:
EEXIST The link referred to by name2 does
exist.
Versions of ln that allow an existing target file remove it first with
unlink(2). There is a *shorter* window during which file2 doesn't
exist than if rm(1) were used because there isn't the extra time of
starting up a new process in between the link and unlink operations,
but there *is* a window.
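In other words, a clobbering ln boils down to this (a sketch; the function name is mine):

```python
import os

def clobbering_ln(src, dst):
    # What an "ln -f"-style ln actually does: unlink(2) the target,
    # then link(2) the source to it.  The two calls are not atomic,
    # so there is still a moment when no file named dst exists --
    # just a shorter one than rm followed by ln from the shell.
    try:
        os.unlink(dst)
    except FileNotFoundError:
        pass
    os.link(src, dst)
```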
--
David J. MacKenzie <d...@eng.umd.edu> <d...@ai.mit.edu>
As far as I know, there's no such atomic operation, either; I sure
haven't seen any such operation in any UNIX system I've ever run into.
Given that, every version of "ln" I know of that removes the target
first has to first "unlink()" the target, and then do the "link()". As
such, the window is still there....
Right. In graph theoretic terms, the find problem is to compute in a directed
graph the set of nodes that are reachable from a given node. It is well-known
how to solve this using standard search techniques such as depth-first.
I wrote a ftwalk function to do this two years ago. It's described in
"An Efficient File Hierarchy Walker" in the Summer '89 USENIX Proceedings.
The upshot is that you can find the reachable set in the presence of
hard-links or symlinks even if there are cycles; and you can do it fast.
The paper also described reimplementations of standard utilities such as
find, ls, etc...
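The core trick is simply to remember each directory's (device, inode) pair and skip any directory already seen; the sketch below (my own toy, not the ftwalk code from the paper) terminates even when symlinks form a cycle:

```python
import os

def safe_walk(top):
    # Depth-first directory walk that tolerates cycles (e.g. symlink
    # loops) by remembering each directory's (st_dev, st_ino) pair.
    seen = set()
    stack = [top]
    while stack:
        d = stack.pop()
        st = os.stat(d)             # stat() follows symlinks
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue                # already visited: a loop or extra link
        seen.add(key)
        yield d
        for name in sorted(os.listdir(d)):
            p = os.path.join(d, name)
            if os.path.isdir(p):
                stack.append(p)
```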
Phong Vo
i agree find may well get scrunted. good. as i said, the best
resolution may well be that someone writes an ftw (or dwalk or whatever)
that WORKS and then change find (and tar and du etc) to use it.
don't forget, this is unix. the user is allowed to screw him/herself.
Multics got this right of course! :-) None of this tedious mucking
about with links into where the files really were... files in
Multics are really in the directory they seem to be in -
links are just another file type, like directories. :-) <f/x> smug.
Well, that's right. But you can improve on Multics. I discussed, and
got wide agreement, years ago (on BIX) an old idea of mine: directories
stink. They are a poor indexing system (tall and thin, with poor
physical clustering, instead of squat and bushy, with good physical
clustering), and they are too navigational.
Note: Multics actually has links and synonyms; a file
(or directory) may have multiple names, i.e. synonyms.
This covers nearly all the uses of hard links, without
the hazards.
There are instead filesystem organizations where a path name is just a
string, and you use something like a btree to map that string into a
file pointer, and instead of a current directory you have a current
prefix. "Links" are then just names (or prefixes) that are declared to
be synonyms, and you make a name (or prefix) resolve to another name
(symlinks) or two names resolve to the same file block pointer (hard
links).
My own idea is actually very different: use a file name resolution
service that is totally not hierarchical. File names are really sets of
keywords, as in say "user,pcg,sources,IPC,header,macros", and a file is
identified by any uniquely resolvable subset of the keywords with which
it is registered with the name service, given in any order. Any
underspecified name resolution returns all file pointers that were
selected. There is no concept of current directory, but rather of
default keyword set, which is merged with keyword sets that are not
marked as "absolute".
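A toy model of the proposed name service (my own sketch of the idea above, nothing more; the class and method names are invented for illustration):

```python
class KeywordNames:
    # Files are registered under a set of keywords and resolved by any
    # subset of them, given in any order.  An underspecified query may
    # return more than one match, as the text above allows.
    def __init__(self):
        self.table = {}  # frozenset of keywords -> file pointer

    def register(self, keywords, fileptr):
        self.table[frozenset(keywords)] = fileptr

    def resolve(self, keywords):
        # Return every registered file whose keyword set contains all
        # of the query keywords.
        q = frozenset(keywords)
        return [fp for kws, fp in self.table.items() if q <= kws]
```

A default keyword set, as described, would just be merged into the query before resolution unless the query were marked "absolute".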
The advantages are obvious: gone is the need to decide whether, for
two compilers and their documentation, to organize the tree as 'ada/doc
c++/doc' or 'doc/ada doc/c++'. You can get rid of links and directories
and traversal programs and all sorts of funny problems.
Note: an interesting thought is that probably we need some
sort of trademarking, i.e. the ability to reserve certain
combinations of keywords (such as 'user,pcg,private') to avoid
cross pollution of name spaces, or a search for ever larger
and more specific keyword sets.
If the implementor is clever efficiency could be dramatic, and probably
not inferior to that of Multics style directory name resolution systems
(not to speak of UNIX style ones), thanks to the ability to use high
density indexing methods with better physical clustering, to offset the
greater overheads for the increased sophistication and flexibility.
I think that a convenient prototype could be done as an NFS server,
(using slashes to separate keywords, why not, to give a UNIX like
flavour to the thing), starting from the free one posted in
comp.sources.unix, and could become a fine dissertation subject. I have
wanted to implement this scheme for many years, but it seems I will
never have the time... Any takers? (I reckon it would make
an especially fine system for Amoeba or other capability based systems).
Note: a very interesting dissertation from Stanford describes a
system that is not too unlike this, except that the author used
keyword based queries not for the name but for attributes of the
file, e.g. owner, size, etc... The results were fairly
encouraging, even in terms of performance, even with a not fully
tuned prototype implementation.
--
Piercarlo "Peter" Grandi | ARPA: pcg%cs.abe...@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: p...@cs.aber.ac.uk
Pretty ****ing dumb of them, given that the BSD "mv" didn't do something
that silly, as I remember (4.3BSD's sure doesn't); if they got
"rename()" from 4.2BSD, one presumes they would have gotten a rational
"mv" from there as well.
No, it is not understandable. It doesn't take much work to do a
mknod() on the device major/minor numbers from stat().
The description for mv(1) is
"mv moves (changes the name of) file1 to file2."
On the bottom, under "Notes:" it says that mv must copy the file
if it exists on another filesystem. So, someone did half the
work to notice the filesystems were different, but never bothered
to see if it was a device or FIFO object.
My complaint is because 1). the behavior is useful [ obviously, since
AT&T and BSD both have ln's with different behaviors and no one has yet
decided that either is patently stupid ] 2). the behaviors are
different so you can't know whether the ln on the system you are using
is going to fail or succeed, depending on your definition of failure
or success. Which means you are forced to use "rm -f $2 ; ln $1 $2"
to have the desired effect [ again, or not, depending on what "the
desired effect" is ].
Standardizing on "ln" and "ln -f" behavior, as I believe POSIX is trying
to do, will resolve the problem - there will be exactly one set of
POSIX-like "ln" and "ln -f" behavior.
(a) There still remain link counts. Just make rmdir check link counts.
It already checks to see if the directory is empty. Then it will
say "rmdir: Directory has firm links".
(b) The inode won't be reused. Link counts again.
This is a perfect description of the UNIX filesystem.
> My own idea is actually very different: use a file name resolution
> service that is totally not hierarchical. File names are really sets of
> keywords, as in say "user,pcg,sources,IPC,header,macros", and a file is
> identified by any uniquely resolvable subset of the keywords with which
> it is registered with the name service, given in any order.
Ugh, ugh, ugh, ugh, ugh. I find it just as important to know that a file
doesn't exist as to find it if it does. By removing any hint of absolute
structure, you're guaranteeing that people will continually access the
wrong files without even noticing what went wrong. Do you never see the
words ``file not found''? Would you rather that the filesystem silently
read or touch or trash a file somewhere else?
---Dan
Hmm. Couldn't something like that be done with symbolic links?
I've never seen a symbolic link that was other than:
'lrwxrwxrwx' (0777)
I'm curious as to what meaning the flags have to a symbolic link?
_MY_ guess (not having access to a BSD machine or manuals anymore, I
can't check or experiment) is that read permission would mean being
allowed to find out what the symbolic link refers to, write permission
being able to change what it refers to, and search/execute being able
to follow it.
Seems to me that this would be a HUGE security hole (the 0777 mode,
that is), so my guess is that I'm wrong; are the mode bits simply
ignored?
If the above is true (or other permissions are needed to modify a
symbolic link, such as write permission on the directory [changing a
symbolic link in place is not possible; one must delete it and
re-create it], or perhaps only the creator can change it), has anyone
ever set the mode to anything other than my equivalent of 0555? To
what purpose?
--
:!mcr!: | < political commentary currently undergoing Senate >
Michael Richardson | < committee review. Returning next house session. >
Play: m...@julie.UUCP Work: mic...@fts1.UUCP Fido: 1:163/109.10 1:163/138
Amiga----^ - Pay attention only to _MY_ opinions. - ^--Amiga--^
Since we're talking about a hypothetical system where mkdir, rmdir, find,
ln, rm, and a few other application programs have been modified to make
links to directories "safe", not a real system, this is pretty much
irrelevant.
Hey, I wasn't saying this situation *was* true on any existing UNIX
systems -- I was describing my objections to Peter da Silva's proposed
implementation of directory hard links. Peter's scheme would disallow
unlinking the "real" entry for a directory if the directory's link count
indicated that there were other (non-primary) hard links to it. His
scheme would work, but only at the expense of opening the system up to
unpleasant situations like the one I described above.
...has nothing to do with the issue being discussed in the article to
which you're following up; that was discussing the non-existence of an
atomic "remove and link" operation, which the previous poster had,
apparently, assumed existed in UNIX and would be used by an "ln" that
removed the target first.
But, in any case...
>is because 1). the behavior is useful [ obviously, since
>AT&T and BSD both have ln's with different behaviors and no one has yet
>decided that either is patently stupid ] 2). the behaviors are
>different so you can't know whether the ln on the system you are using
>is going to fail or succeed, depending on your definition of failure
>or success.
Which means the ultimate problem isn't that one behavior is "good" and the
other is "bad", but that they're *different*. Standardizing on either
one would have worked (modulo windows opened by having to implement one
behavior with multiple commands on a system that provides the other).
But my basic point still stands. If you add such a restriction (based
on link counts instead of root privileges this time), then you're
still stuck with the "unlovely situation" I mentioned (see below for
example). And if you don't add the prohibition, you're stuck with
the ".." problem.
Here's an example of the problem:
1) I create a subdirectory named "sub".
2) Unbeknownst to me, Joe Schmo creates a hard link of his own
to "sub".
3) I try to rmdir "sub", which is empty, and find that I cannot,
because its link count is > 2.
So now I'm stuck with a subdirectory that I own that lives in a
directory that I can write, but I can't delete it! All I know is how
many extra links to it exist -- and I have no way of finding out
*where* those links are. Contrast this case to the deletion of an
ordinary file with many links, and you'll see the difference. There's
nothing preventing me unlinking the file, yet there is for the
directory.
That is the behavior I find objectionable.
There's hardly a difference; things really get nice if quotas are enabled...
)That is the behavior I find objectionable.
Indeed. Perhaps another file attribute could be of service here:
permission to make a hard link. Or get rid of hard links altogether and
use symlinks instead.
--
"and with a sudden plop it lands on usenet. what is it? omigosh, it must[...]
be a new user! quick kill it before it multiplies!" (Loren J. Miller)
Let's take it one step further. Suppose you have a quota and you have
a file XXX. Joe Schmo creates a hard link to file XXX. Later you
delete the file, but XXX exists under another file name, so your quota
is still charged for file XXX. This can happen; also, Joe Schmo could
(if he's an annoying user) link system files to random places
(granted, there are ways of finding the links). This type of action
may be deplorable, but I believe it raises the question of whether to
let the user make hard links at all.
Is there a particular reason that a user should be allowed to hard link a file
if he doesn't own it and doesn't have write permission to it?
--
Christopher A. Provenzano | System Administration is like
uucp: uunet!mimsy!proven | juggling, the more systems you have,
email: pro...@eng.umd.edu | the more likely they will be down...
voice: (301) 454-7705 |
>Which means the ultimate problem isn't that one behavior is "good" and the
>other is "bad", but that they're *different*. Standardizing on either
>one would have worked (modulo windows opened by having to implement one
>behavior with multiple commands on a system that provides the other).
On systems without ln -f (where ln defaults to removing the target if
it already exists) another program is required to perform a link which
fails if the target exists. On SysVr[23] on 3B2's, this program exists
as /etc/link:
-r-x------ 1 root bin 1716 Mar 3 1988 /etc/link
Thus, unless someone changes things there is no atomic file action
that can be used by an ordinary user in a shell script. I'd call
that "bad".
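For what it's worth, the primitive /etc/link exposes is just the bare system call, which is itself atomic and refuses to clobber (a sketch; the wrapper name is mine):

```python
import os

def exclusive_link(src, dst):
    # link(2) is atomic and fails rather than clobbering, so two
    # racing callers can never both succeed in creating dst.
    os.link(src, dst)   # raises FileExistsError (EEXIST) if dst exists
```

Without something like it installed mode 755, an ordinary user's shell script indeed has no atomic test-and-create on such a system.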
Les Mikesell
l...@chinet.chi.il.us
> 2) Unbeknownst to me, Joe Schmo creates a hard link of his own
> to "sub".
> 3) I try to rmdir "sub", which is empty, and find that I cannot,
> because its link count is > 2.
So rename it, and bitch to your system admin guy. The only problem with this
would be if it was a BIG directory and was still on your disk quota (if your
system does such things), and Joe Schmo could still screw you up that way
if it was a file.
Yes, symlinks are more useful. Unfortunately they're still not universally
available. This is a trivial change at the application level for systems that
don't yet support the newer method.
Personally I'm more upset about the fact that "cat" still doesn't use
perror.
Let's take it one step further. Suppose you have a quota and you
have a file XXX. Joe Schmo creates a hard link to file XXX. Later you
delete the file, but XXX exists under another file name, so your quota
is still charged for file XXX. This can happen; also, Joe Schmo
Maybe the system's quota or file ownership policies can be changed?
For example, if you unlink the file, perhaps it could somehow be
charged to his quota instead?
Or maybe everybody can have a workstation with its own disk.
--
Neither representing any company nor, necessarily, myself.
... or maybe the BSD implementation of quotas is simply a hack.
>As far as I know, there's no such atomic operation, either; I sure
>haven't seen any such operation in any UNIX system I've ever run into.
>Given that, every version of "ln" I know of that removes the target
>first has to first "unlink()" the target, and then do the "link()". As
>such, the window is still there....
Why don't the versions of ln that you know of on 4.3 BSD and SunOS 4
use the rename system call?
idavolde 2 >man rename
...
DESCRIPTION
Rename causes the link named from to be renamed as to. If
to exists, then it is first removed. ...
Rename guarantees that an instance of to will always exist, even if
the system should crash in the middle of the operation.
I've heard that it was precisely because of the various race
conditions in connection with ln that made Berkeley put the function
into the kernel where inodes could be locked and updates done in
proper order.
--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark [uunet!]mcsun!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers. tho...@diku.dk
Because they're named "ln", not "mv".
"rename()" is *NOT* an atomic operation that removes the target and
makes an *additional* link from the source with the name of the target.
It's an atomic operation that *renames* the source to the target, and if
it succeeds does *not* leave the source behind.
Because rename doesn't do the same thing as ln.
If before you run ln, the file has n links, after you run ln
successfully the file will have n + 1 links.
If you just rename the file, it will still have n links, but one of
them will be different from before.
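The difference is easy to see through Python's thin wrappers over the two system calls (a small illustration, not a claim about any particular ln implementation):

```python
import os
import tempfile

d = tempfile.mkdtemp()
a = os.path.join(d, "a")
open(a, "w").close()

os.link(a, os.path.join(d, "b"))        # ln: the file now has 2 links
assert os.stat(a).st_nlink == 2

os.rename(os.path.join(d, "b"),
          os.path.join(d, "c"))         # rename: still 2 links, one of
assert os.stat(a).st_nlink == 2         # them just has a new name
```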
>["rename" is]
> an atomic operation that *renames* the source to the target, and if
>it succeeds does *not* leave the source behind.
I [usually] know that. But now that I think about it: Why doesn't ln:
1) Create a new, randomly named, link in the target directory.
2) Use "rename" to atomically replace the target with that link?
There will be no time when the target name doesn't exist. If the
machine crashes (or ln is killed), a funny link may be left behind,
but I think that is a smaller problem than the timing race.
I don't know if it would be wise to make this the default behaviour of
ln, but it might be useful to provide it as an option. (Of course, the
method would only be used if the target already existed.)
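The suggested scheme looks like this (a sketch; note the small race on the temporary name between the unlink and the link, which a real implementation would want to close by retrying):

```python
import os
import tempfile

def replace_with_link(src, dst):
    # Create the new hard link under a throwaway name in dst's
    # directory, then rename(2) it over dst.  rename() replaces its
    # target atomically, so a file named dst exists at every instant;
    # at worst a crash leaves the funny temporary link behind.
    d = os.path.dirname(os.path.abspath(dst))
    fd, tmp = tempfile.mkstemp(dir=d)   # reserve a unique name
    os.close(fd)
    os.unlink(tmp)                      # free the name (small race here)
    os.link(src, tmp)
    os.rename(tmp, dst)                 # atomic replacement of dst
```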
> The "correct" solution, perhaps, is to make mv check for special
>files after the rename() fails, and to use the mknod code to recreate
>special files. The problem with this is that it would require mv to
>be setuid. Perhaps a better solution would be for mv to notice
>special files and print out an error like "cannot rename special files
>across partitions"; that is, after all, what it used to print when you
>tried to mv regular files across partitions (without the word
>"special").
How about requiring the user to have an effective uid of 0? ("must be root to
move /dev/barf to /usr/tmp/barf") Having "mv" setuid gives me the willies. Too
much room for a poor implementation to be fooled into doing something that the
normal UNIX protections would otherwise disallow.
>Jonathan Kamens USnail:
>MIT Project Athena 11 Ashford Terrace
>j...@Athena.MIT.EDU Allston, MA 02134
>Office: 617-253-8495 Home: 617-782-0710
--
* Daniel R. Levy * uunet!tellab5!mtcchi!levy * This is not on behalf of MTC *
So far as I can remember, there is not one | ... therefore be as shrewd as
word in the Gospels in praise of intelligence.| serpents [see Gen. 3] and harm-
-- Bertrand Russell [Berkeley UNIX fortune] | less as doves -- JC [Mt. 10:16]