Unix Access Rights
All Unix files are marked with access rights. Files may be readable (r), writable (w), or executable (x).
Typically, these rights are referred to in shorthand using a three-letter access rights designation. The following combinations are the most commonly used, out of a total of 8 combinations:
The only combination that has no well-understood use is -wx, meaning you can change the file or execute it, but you can't read it. Write-only files, marked -w-, are rare but have well-defined uses as log files, where the program or programs writing the log have no legitimate reason ever to read back the entries they've written.
There is no law that these are the only access rights. Many users have wanted to have other rights. For example, in some contexts, it makes good sense to allow a user to append to a file but not to write arbitrarily over existing data in the file. Unix does not provide this right, while some other systems do.
In the general case, the entire idea of a fixed set of access rights is insufficient. In reality, every file is the representation of some kind of abstract object. Source files, databases, video files, and object files each have different appropriate operations, and files created to support new applications generally support new operations. In an ideal world, the access rights would correspond to methods on abstract objects.
Unix distinguishes three classes of users: the owner of the file, the members of the file's group, and everyone else (the public).
To support this, each file has an owner ID and a group ID, and each user has a user ID and a group ID. For small systems, groups are of little value, but in a large system such as a departmental server, we can have a group for each class, or a group for faculty and another group for teaching assistants.
Again, Unix access rights are fairly limited. A file can only belong to one group at a time (although the chgrp command can be used to change the group of any file, so long as the user is a member of that group and the user owns the file).
The full list of rights for a Unix file is usually printed as 10 characters, a single character indicating whether or not the file is a directory, and then three groups of three, starting with the owner rights and ending with the public rights. Type ls -l to see the rights for some file or files:
> ls -l myls notes
-rwxr-xr-x 1 jones jones  86 Feb 19 09:51 myls
-rw----r-- 1 jones class 272 Feb 19 09:37 notes
Neither file listed above is a directory. The first file, myls, has owner access rights rwx while the group and public access rights are set to r-x. The second file, notes, has owner rights rw-, no group access, and public rights r--. The group of the second file was set to class, so all members of this group will be denied access to the file even though other members of the public have read access.
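The way the group can end up with fewer rights than the public follows from the rule that exactly one of the three classes applies to any given user. The following is a minimal sketch of that rule, not the kernel's actual code; the numeric IDs are invented stand-ins for jones, the group class, and some other group.

```python
import stat

def class_rights(mode, file_uid, file_gid, uid, gids):
    """Return the rwx string of the single class that applies to this user."""
    bits = stat.filemode(stat.S_IFREG | (mode & 0o777))[1:]  # e.g. 'rw----r--'
    if uid == file_uid:
        return bits[0:3]      # owner rights
    if file_gid in gids:
        return bits[3:6]      # group rights, even if weaker than public rights
    return bits[6:9]          # public rights

# Hypothetical numeric IDs; real values depend on the system.
JONES, CLASS, OTHER_GROUP = 1000, 200, 300
notes_mode = 0o604            # rw----r--, as in the listing above

print(class_rights(notes_mode, JONES, CLASS, JONES, {OTHER_GROUP}))  # rw-
print(class_rights(notes_mode, JONES, CLASS, 1001, {CLASS}))         # ---
print(class_rights(notes_mode, JONES, CLASS, 1002, {OTHER_GROUP}))   # r--
```

The second call is the case described in the text: a member of class gets the group rights (none), and the public read right is never consulted.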
The Unix file system has one strange interpretation of access rights. For directories, the execute right confers the right to traverse that directory during the interpretation of a file's path name. Consider the following:
> cd ~jones
> ls -dl .
drwx--s--x 57 jones faculty 4096 Feb 19 11:15 .
This directory (the home directory of a faculty member named jones) has rwx rights for that user, but others, members of the faculty and members of the general public, only have --x rights. Others cannot list this directory or add links to it, but they can follow paths that pass through the directory. Therefore, for example, members of the public can access ~jones/.public-html/index.shtml, that user's home page on the web.
In fact, the web server runs under its own user ID, so the web server has no access to the files of this faculty member named jones, aside from the access that a member of the general public would have.
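The traversal rule can be illustrated by walking a path and checking the public execute bit on each directory along the way. This is only a sketch: it inspects mode bits and ignores ownership, and the directory names are invented for the demonstration.

```python
import os
import stat
import tempfile

def public_can_traverse(root, relpath):
    """True if root and every directory below it on the way to relpath has
    the public execute bit set. Only mode bits are inspected; who owns the
    directories is ignored in this sketch."""
    directory = root
    for name in relpath.split("/")[:-1]:
        if not os.stat(directory).st_mode & stat.S_IXOTH:
            return False
        directory = os.path.join(directory, name)
    return bool(os.stat(directory).st_mode & stat.S_IXOTH)

# Hypothetical layout standing in for a home directory with a web subdirectory.
home = tempfile.mkdtemp()
web = os.path.join(home, "web")
os.mkdir(web)
os.chmod(web, 0o755)                                 # rwxr-xr-x

os.chmod(home, 0o711)                                # rwx--x--x
print(public_can_traverse(home, "web/index.html"))   # True
os.chmod(home, 0o700)                                # rwx------
print(public_can_traverse(home, "web/index.html"))   # False
```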
The Unix model of setting up three different access categories for a file is flexible, but it is seriously limited. It is not difficult to construct situations where files need to be accessible to two different groups of users quite distinctly from either the personal access rights of the file owner or the rights conferred to the public. Consider the following:
In a team-taught course, we are interested in files where a group of faculty for the course are equally able to update the content, and a group of students in the course are equally able to read the content, while the general public may be excluded. The Unix model does not provide good support for this.
In fact, Unix began as a less ambitious, scaled-down version of Multics, the big federally funded operating system that was originally a joint venture between MIT, GE and Bell Labs. As such, its access control model was a simplified version of the original Multics idea.
In Multics, each file could have an arbitrary length access control list, where each entry paired a user name with the access rights of that user. This scheme is entirely general, and many modern versions of Unix attempted to add this generality back. Unfortunately, afterthoughts added to operating system designs rarely fit in cleanly and are rarely used effectively by users.
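To make the idea concrete, an access control list of the kind described above can be pictured as a list of (user name, rights) pairs. The sketch below applies it to the team-taught course example; the names and the lookup rule (first matching entry wins, no entry means no access) are assumptions for illustration and do not reproduce the actual Multics format.

```python
# Hypothetical course roster; the names are invented for illustration.
course_acl = [
    ("smith", "rw-"),    # faculty: may read and update
    ("nguyen", "rw-"),   # faculty
    ("lee", "r--"),      # student: read only
    ("patel", "r--"),    # student
]

def acl_rights(acl, user):
    """Return the rights of the first entry naming user, or '---' if none."""
    for name, rights in acl:
        if name == user:
            return rights
    return "---"

print(acl_rights(course_acl, "smith"))     # rw-
print(acl_rights(course_acl, "lee"))       # r--
print(acl_rights(course_acl, "visitor"))   # ---
```

Two groups with different rights plus an excluded public are expressed directly here, which is exactly what the three fixed Unix classes cannot do.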
The Unix system and its descendants maintain at least three independent notions of file type.
The most fundamental is the distinction between directories and data files. Directories may be opened for read access just like any file, but the only way to write a directory is to create a file in it, add a link to it, or remove a link from it.
For executable files, there is a second kind of file type created by the magic number at the start of the file. Unix and Unix-like systems rely on this magic number to determine what interpreter (whether software or the machine's bare hardware) to use to execute that file.
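As a rough illustration, a program can inspect the first bytes of a file the same way. This sketch recognizes only two well-known magic numbers, the #! that introduces a script's interpreter line and the four-byte ELF header used by many Unix-like systems; other formats exist, and the file names are invented for the demonstration.

```python
import os
import tempfile

def executable_kind(path):
    """Classify a file by the magic number in its first bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(b"#!"):
        return "script"          # the rest of the line names the interpreter
    if head == b"\x7fELF":
        return "elf"             # machine code in ELF format
    return "unknown"

# Two small sample files; the ELF one is only a header fragment, not a real program.
workdir = tempfile.mkdtemp()
script_path = os.path.join(workdir, "sample-script")
binary_path = os.path.join(workdir, "sample-binary")
with open(script_path, "wb") as f:
    f.write(b"#!/bin/sh\necho hello\n")
with open(binary_path, "wb") as f:
    f.write(b"\x7fELF" + bytes(12))

print(executable_kind(script_path))   # script
print(executable_kind(binary_path))   # elf
```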
Finally, there is the file extension used by many applications to decide what application to use for editing or compiling a file. The Unix kernel does not understand file extensions at all. If you want to create a shell script named myscript.c, both the kernel and most Unix shells will be perfectly happy with this. Only the applications themselves pay attention to the presence of dots within command names and give the file name extension any special value.
In this last regard, Unix is quite different from Windows, where the file name extension has mandatory meaning.
In fact, what we really want is a clear notion of file type, where files of type directory are edited using directory management operations, while files of type "C program" cannot be executed but can be compiled by a C compiler, and files of type "shell script" can be executed by the designated shell. In effect, what we really want is a polymorphic class "file" to which the methods edit, execute and process can be applied. Sadly, our current operating systems don't work this way.
Suppose two users, call them Alice and Bob, are deeply suspicious of each other. Alice has written a really useful software package that Bob wants to use, but Alice is certain that Bob wants to steal and resell the package. Bob has a database he values very greatly, and he hesitates to use Alice's package on it for fear that her code will open his files and steal his data. Both Alice and Bob store all of their files on the same file server.
This problem was first formulated in the mid 1960s, in the context of the Multics computer system. Multics was supposed to be a computer utility, shared by many customers. In fact, Multics was the prototype for the institution we now call an ISP, offering file hosting and other shared information processing services to the public.
We can begin to solve this problem if Alice marks her package as a public execute-only package. This way, Bob can run Alice's software, but he can't make a copy. The problem is, under the normal rules for launching an application, when Bob runs the code, it runs with Bob's user ID, so it can open all of Bob's private files and steal copies of his data.
The Unix solution to this problem was perhaps the only genuinely new invention in the development of Unix -- just about everything else in that system was invented by someone else and then incorporated into Unix.
In Unix, each file has a special bit in the access rights list, called the set user ID bit or SUID bit. For executable files, if this bit is set, executing that file with any variant of the exec system call will change the user ID of the current process to the owner of the file.
So, Alice can mark the executables in her package as publicly executable with the SUID bit set (reported by the ls -l command as the access rights string rws-----x). The s in the owner rights, where the executable status would normally be reported, indicates that the file is executable with the SUID bit set.
The result is that, when Bob runs any executable from Alice's package, it runs as if Alice had run it. It cannot open any of Bob's files unless those files have appropriate group or public permissions that allow Alice to open them herself.
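The SUID bit is stored alongside the nine ordinary rights bits, and the s shown by ls -l is just a rendering of it. A small sketch using Python's standard stat module shows the encoding; the helper name is_suid is an invention for this example.

```python
import os
import stat

# Regular file, SUID bit set, rwx for the owner, nothing for the group,
# execute-only for the public.
mode = stat.S_IFREG | stat.S_ISUID | 0o701

print(stat.filemode(mode))                  # -rws-----x
print(stat.filemode(stat.S_IFREG | 0o701))  # -rwx-----x (same rights, no SUID)

def is_suid(path):
    """True if the file at path has the set user ID bit set."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)
```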
This then raises a question: How can Bob use Alice's programs to operate on his data? Alice's programs cannot open Bob's files, but Bob can open his own files. Therefore, if he opens a file before calling some version of exec to launch Alice's application, he can pass the open file to Alice.
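The most common way this is done is to use standard input and standard output to pass data to Alice's application and to get data back out. So, if Bob wants to run Alice's application called alap to process the data in his file called bobfile, putting the results in a second file (called bobout here for illustration), he could issue this shell command:
> alap < bobfile > bobout
The output must go to a different file because the shell truncates the output file before alap starts, so redirecting the output back into bobfile would destroy the input.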
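The same hand-off can be sketched in a program: the parent opens the files under its own identity and passes the already open files to the child as standard input and standard output. In this sketch the child is a one-line Python stand-in for Alice's application (it merely uppercases its input), and the file names are invented; the point is only that the child never opens any file by name.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for Alice's program: it only ever touches stdin and stdout.
CHILD = "import sys; sys.stdout.write(sys.stdin.read().upper())"

def run_on_open_files(in_path, out_path):
    """Open both files in the parent (Bob), then hand them to the child."""
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        subprocess.run([sys.executable, "-c", CHILD],
                       stdin=src, stdout=dst, check=True)

workdir = tempfile.mkdtemp()
bobfile = os.path.join(workdir, "bobfile")
result = os.path.join(workdir, "result")
with open(bobfile, "w") as f:
    f.write("some of bob's data\n")

run_on_open_files(bobfile, result)
with open(result) as f:
    print(f.read(), end="")     # SOME OF BOB'S DATA
```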
If Alice's application needs to open one of Bob's files midway through its operation, as originally conceived, it could not do so directly. It could call an SUID application of Bob's to open the file, and if Bob and Alice agreed on the calling conventions, Bob's application could pipe the desired data to Alice's code.
Many Unix system utilities, such as passwd and su, have the SUID bit set, so that when a user runs the utility, it gains privileged status by virtue of this mechanism. Few Unix users understand how to use the SUID bit for applications.
It should also be noted that the Unix system has an SGID bit to set the group ID of a process.
The above discussion of the SUID and SGID bits applies to those bits as attributes of executable files. In the case of directories, since you cannot execute them, these bits get used for an entirely different purpose.
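If the SUID or SGID bit is set for a directory, then when new files are created in that directory, they automatically inherit the ownership or group of that directory instead of being owned by or in the group of the creating process. Support varies between systems: Linux, for example, honors only the SGID bit on directories and ignores the SUID bit there.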
Newly created files have access rights set by the file creation mask. This mask can be set or displayed by the Unix umask system call and by the built-in shell command of the same name. The program creating the file sets the maximum rights that make sense for that file. Most text files are, by default, rw-rw-rw- while executable code is, by default, rwxrwxrwx. The mask includes the bits that are to be excluded from these rights. Unfortunately for users, the mask is coded in octal, so instead of writing umask ----w--w- to turn off group and public write-access to newly created files, you write umask 022. Having set this value for the file creation mask, text files will be created as rw-r--r-- and executable files as rwxr-xr-x.
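The arithmetic behind the mask is a bitwise AND with the complement of the mask. The sketch below works through the numbers from the paragraph above; the helper names are inventions for this example, and the symbolic form ----w--w- is only a notation used here, not something the umask command accepts.

```python
import stat

def rwx(mode):
    """Render nine permission bits as a string such as 'rw-r--r--'."""
    return stat.filemode(stat.S_IFREG | mode)[1:]

def mask_from_symbolic(symbolic):
    """Turn a 9-character string such as '----w--w-' into its numeric value."""
    value = 0
    for ch in symbolic:
        value = (value << 1) | (ch != "-")
    return value

def created_mode(requested, mask):
    """Rights a new file receives: those requested, minus the masked bits."""
    return requested & ~mask

mask = mask_from_symbolic("----w--w-")
print(oct(mask))                        # 0o22
print(rwx(created_mode(0o666, mask)))   # rw-r--r--
print(rwx(created_mode(0o777, mask)))   # rwxr-xr-x
```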
The chmod shell command for changing the access rights on a file allows the rights to be set either in symbolic or numeric form, so chmod u=rwx,go=rx myfile and chmod 0755 myfile do exactly the same thing. The chmod command also allows incremental changes to the access rights, so chmod +x myfile adds executable rights to the file, and chmod -w myfile removes the right to write the file. If you want to just change the user, group or other rights, you can use prefixes to control the edit, so chmod go-w myfile removes the write rights from the group and other fields, while leaving the owner's rights unchanged, and chmod u+x myfile adds execute rights for the file's owner, while leaving the group and other rights unchanged.
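The incremental forms amount to reading the current rights, setting or clearing a few bits, and writing the result back. A minimal sketch with Python's os.chmod follows; the function names are invented, and a temporary file stands in for myfile.

```python
import os
import stat
import tempfile

def current_rights(path):
    """The permission bits of path, without the file type bits."""
    return stat.S_IMODE(os.stat(path).st_mode)

def remove_group_other_write(path):
    """Rough equivalent of: chmod go-w path"""
    os.chmod(path, current_rights(path) & ~(stat.S_IWGRP | stat.S_IWOTH))

def add_owner_execute(path):
    """Rough equivalent of: chmod u+x path"""
    os.chmod(path, current_rights(path) | stat.S_IXUSR)

fd, myfile = tempfile.mkstemp()
os.close(fd)
os.chmod(myfile, 0o666)                 # start from rw-rw-rw-

remove_group_other_write(myfile)
print(oct(current_rights(myfile)))      # 0o644
add_owner_execute(myfile)
print(oct(current_rights(myfile)))      # 0o744
```

Note that the second step only adds a bit: the group and other rights are whatever they were before.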
After having invented a very strong mechanism, the designers of Unix, and those who enhanced this original design, proceeded to shoot themselves in the feet, defeating the beautiful solution they had created to the mutual distrust problem.
The problem is, the Unix system maintains the original or real user ID of each process, as well as the current effective user ID. A program running under a user or group ID obtained from the SUID or SGID mechanisms can revert to the real user ID using the setuid or setgid system calls. These calls allow any program to change from its effective user ID to its real user ID at any time, and there is no clean way for a distrustful user to prevent this.
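The two IDs and the call that reverts between them can be seen from any process. The sketch below only shows the calls involved; run as an ordinary program the real and effective IDs are already equal, so the revert changes nothing. It is not a working exploit: on Linux, for instance, the SUID bit is ignored for interpreted scripts, so the situation in the text would arise inside a compiled SUID program. The function names are invented for this example.

```python
import os

def user_ids():
    """Return (real user ID, effective user ID) of the current process."""
    return os.getuid(), os.geteuid()

def revert_to_real_uid():
    """Set the effective user ID back to the real user ID."""
    os.setuid(os.getuid())

real, effective = user_ids()
print("real:", real, "effective:", effective)

revert_to_real_uid()
print(os.geteuid() == os.getuid())   # True
```

In Alice's package, code that made this call would from then on run as Bob, which is precisely the protection the SUID bit was supposed to provide against.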
The Wikipedia entries are useful: