IC221: Systems Programming (SP14)


Lecture 02: Globbing, Ownership, Groups, and Permissions

1 Pattern Matching in Unix File System

When working in the command line terminal, we often want to perform an operation over many files at once. For example, maybe we want to use cat to print the contents of all files whose names start with "file". If the directory contained many hundreds of such files, then writing out each file name by hand would be terrible.

To alleviate such annoyances, the Unix shell provides a simple mechanism for pattern matching. Describing patterns that match strings is well studied in computer science and is the basis of a great deal of theoretical work. In this class we will apply pattern matching in the shell to perform simple tasks, and, in fact, you've seen some of this already when using grep in the last lab. Generally, we will use two pattern matching techniques:

  1. Globbing: A shell-driven pattern matching facility that allows the user to match patterns against the names of files and directories.
  2. Regular Expressions: An expressive pattern matching language that can match the contents and names of files and directories through other command line tools, like sed and grep.

In this lecture, we will focus primarily on globbing, and in the lab, you will have some basic exposure to regular expressions.

2 Globbing

Globbing is a process of filename expansion, which provides a way for the user to refer to multiple files at the same time. The key idea behind globbing is that you can use a wildcard that can match anything, and the shell then expands the command line with all the files or directories that match the pattern. Think of this like a wildcard in poker: if you have a wildcard, then it can become any card you like. In the same way, when you use a wildcard in a pattern, it can stand in for any character you want to match in some specified way.
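To see that it is the shell, and not the command, that performs the expansion, here is a small sketch. The scratch directory and the file names are made up for illustration, and the behavior for a pattern that matches nothing is the default one in bash and sh; other shells or shell options may differ.

```shell
# Work in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch file.1 file.2 file.3

# The shell replaces the pattern with the matching names before echo runs,
# so echo receives three separate arguments.
expanded=$(echo file.*)
echo "$expanded"      # file.1 file.2 file.3

# Quoting the pattern suppresses expansion: echo receives the literal text.
literal=$(echo 'file.*')
echo "$literal"       # file.*

# A pattern that matches nothing is passed through unchanged by default.
unmatched=$(echo nothing.*)
echo "$unmatched"     # nothing.*

# Clean up.
cd - >/dev/null
rm -rf "$demo_dir"
```

Because the command only ever sees the expanded names, the same patterns work with cat, ls, rm, or any other program.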

There are four primary wildcards used for globbing:

  1. * – match zero or more characters
  2. ? – match exactly one character
  3. [] – match exactly one character from the set
  4. [^ ] – match exactly one character not from the set

2.1 Match zero or more with *

As an example, consider the contents of this directory with 201 files, file.0 through file.200:

#> ls
file.0    file.115  file.132  file.15   file.167  file.184  file.200  file.38   file.55   file.72   file.9
file.1    file.116  file.133  file.150  file.168  file.185  file.21   file.39   file.56   file.73   file.90
file.10   file.117  file.134  file.151  file.169  file.186  file.22   file.4    file.57   file.74   file.91
file.100  file.118  file.135  file.152  file.17   file.187  file.23   file.40   file.58   file.75   file.92
file.101  file.119  file.136  file.153  file.170  file.188  file.24   file.41   file.59   file.76   file.93
file.102  file.12   file.137  file.154  file.171  file.189  file.25   file.42   file.6    file.77   file.94
file.103  file.120  file.138  file.155  file.172  file.19   file.26   file.43   file.60   file.78   file.95
file.104  file.121  file.139  file.156  file.173  file.190  file.27   file.44   file.61   file.79   file.96
file.105  file.122  file.14   file.157  file.174  file.191  file.28   file.45   file.62   file.8    file.97
file.106  file.123  file.140  file.158  file.175  file.192  file.29   file.46   file.63   file.80   file.98
file.107  file.124  file.141  file.159  file.176  file.193  file.3    file.47   file.64   file.81   file.99
file.108  file.125  file.142  file.16   file.177  file.194  file.30   file.48   file.65   file.82
file.109  file.126  file.143  file.160  file.178  file.195  file.31   file.49   file.66   file.83
file.11   file.127  file.144  file.161  file.179  file.196  file.32   file.5    file.67   file.84
file.110  file.128  file.145  file.162  file.18   file.197  file.33   file.50   file.68   file.85
file.111  file.129  file.146  file.163  file.180  file.198  file.34   file.51   file.69   file.86
file.112  file.13   file.147  file.164  file.181  file.199  file.35   file.52   file.7    file.87
file.113  file.130  file.148  file.165  file.182  file.2    file.36   file.53   file.70   file.88
file.114  file.131  file.149  file.166  file.183  file.20   file.37   file.54   file.71   file.89

Now suppose I want to match just the files that begin with file.4 but can end in anything. So the file file.4 would match, as well as the files file.44 and file.41. To do so I can use a wildcard, the * or asterisk, like this: file.4*.

#> ls file.4*
file.4   file.40  file.41  file.42  file.43  file.44  file.45  file.46  file.47  file.48  file.49 

The * symbol semantically says: "Match zero or more characters." That means file.4 matches, since there are zero characters following the "4", and file.41 matches, since there is one character following the "4", the "1". You can more clearly see that * matches zero or more characters by looking at the glob file.1*:

#> ls file.1*
file.1    file.109  file.119  file.129  file.139  file.149  file.159  file.169  file.179  file.189  file.199
file.10   file.11   file.12   file.13   file.14   file.15   file.16   file.17   file.18   file.19
file.100  file.110  file.120  file.130  file.140  file.150  file.160  file.170  file.180  file.190
file.101  file.111  file.121  file.131  file.141  file.151  file.161  file.171  file.181  file.191
file.102  file.112  file.122  file.132  file.142  file.152  file.162  file.172  file.182  file.192
file.103  file.113  file.123  file.133  file.143  file.153  file.163  file.173  file.183  file.193
file.104  file.114  file.124  file.134  file.144  file.154  file.164  file.174  file.184  file.194
file.105  file.115  file.125  file.135  file.145  file.155  file.165  file.175  file.185  file.195
file.106  file.116  file.126  file.136  file.146  file.156  file.166  file.176  file.186  file.196
file.107  file.117  file.127  file.137  file.147  file.157  file.167  file.177  file.187  file.197
file.108  file.118  file.128  file.138  file.148  file.158  file.168  file.178  file.188  file.198
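The "zero or more" behavior can be checked on a much smaller directory. The following is a minimal sketch with made-up file names in a scratch directory; set -- is used only as a convenient way to count how many names the pattern expanded to.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch file.4 file.40 file.41 file.400 file.5

# Zero characters after the 4 matches file.4 itself; longer suffixes match too.
star_matches=$(echo file.4*)
echo "$star_matches"      # file.4 file.40 file.400 file.41

# set -- loads the expanded names into the positional parameters,
# and $# reports how many there are.
set -- file.4*
star_count=$#
echo "$star_count"        # 4

cd - >/dev/null
rm -rf "$demo_dir"
```

Note that file.5 is left out, and that * places no limit on the length of the suffix, which is why file.400 is included.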

2.2 Match exactly 1 with ?

There are situations when you want to match exactly 1 character. For example, suppose we want to list only the files file.40 through file.49, but not file.4. If the only wildcard we had was the *, then we would not be able to write a glob for that, and worse, if the directory also contained names like file.400, then excluding those would not be possible either.

Instead, we need a glob wildcard that can do a limited match. For that we use the ? wildcard, which matches exactly 1 character. So we can now write the condition for file.40 through file.49 as the glob file.4?.

#> ls file.4?
file.40  file.41  file.42  file.43  file.44  file.45  file.46  file.47  file.48  file.49

Notice that file.4 does not match file.4? because file.4 does not have exactly one more character after the "4", as required by the ? wildcard. You can also place the ? in the middle of a glob.

#> ls file.1?5
file.105  file.115  file.125  file.135  file.145  file.155  file.165  file.175  file.185  file.195

On your own, what would file.1?? match?
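Here is a small sketch contrasting ? with the names it rejects. The file names are invented for the demonstration and live in a scratch directory.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch file.4 file.40 file.41 file.400 file.15 file.105 file.115

# ? must consume exactly one character: file.4 has too few characters
# after the 4 and file.400 has too many.
one_after_4=$(echo file.4?)
echo "$one_after_4"       # file.40 file.41

# ? can also sit in the middle of a pattern; file.15 is one character short.
middle=$(echo file.1?5)
echo "$middle"            # file.105 file.115

cd - >/dev/null
rm -rf "$demo_dir"
```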

2.3 Match from a set with [ ] and [ ^ ]

Finally, to complete the matching capabilities of globs, we need a way to match from a subset of choices. Consider a situation where you want to match all files matching file.13? or file.15?. That is, the second digit in the file name can be either a 3 or a 5, and we can describe that using a [] wildcard, like file.1[35]?.

#> ls file.1[35]?
file.130  file.132  file.134  file.136  file.138  file.150  file.152  file.154  file.156  file.158
file.131  file.133  file.135  file.137  file.139  file.151  file.153  file.155  file.157  file.159

You can also negate a set, which matches any one character that is not in the [].

#> ls file.1[^35]?
file.100  file.108  file.116  file.124  file.142  file.160  file.168  file.176  file.184  file.192
file.101  file.109  file.117  file.125  file.143  file.161  file.169  file.177  file.185  file.193
file.102  file.110  file.118  file.126  file.144  file.162  file.170  file.178  file.186  file.194
file.103  file.111  file.119  file.127  file.145  file.163  file.171  file.179  file.187  file.195
file.104  file.112  file.120  file.128  file.146  file.164  file.172  file.180  file.188  file.196
file.105  file.113  file.121  file.129  file.147  file.165  file.173  file.181  file.189  file.197
file.106  file.114  file.122  file.140  file.148  file.166  file.174  file.182  file.190  file.198
file.107  file.115  file.123  file.141  file.149  file.167  file.175  file.183  file.191  file.199

Note that a set glob is like a ? in that it matches exactly one character, so the glob file.[1][1][1] will only match file.111.
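The set wildcards can be tried out on a handful of made-up files, as in the sketch below. One assumption to flag: the negated set is written here as [!35], which is the form specified by POSIX; bash also accepts the [^35] spelling used in these notes.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch file.111 file.130 file.145 file.150

# The second digit must be 3 or 5; the third digit can be anything.
in_set=$(echo file.1[35]?)
echo "$in_set"            # file.130 file.150

# Negated set: the second digit must be anything except 3 or 5.
not_in_set=$(echo file.1[!35]?)
echo "$not_in_set"        # file.111 file.145

# Each set stands for exactly one character, so three sets match three characters.
three_ones=$(echo file.[1][1][1])
echo "$three_ones"        # file.111

cd - >/dev/null
rm -rf "$demo_dir"
```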

2.4 Subdirectory Matching

Globs can be used to match subdirectories. Consider the following directory layout:

#> ls
dir.a/  dir.ab/ dir.ad/ dir.ba/ dir.bc/ dir.be/ dir.ca/ dir.cc/ dir.ce/ dir.da/ dir.dc/ dir.de/ dir.eb/ dir.ed/
dir.aa/ dir.ac/ dir.b/  dir.bb/ dir.bd/ dir.c/  dir.cb/ dir.cd/ dir.d/  dir.db/ dir.dd/ dir.ea/ dir.ec/ dir.ee/

Each directory has the following files, which you can list by giving ls the glob *.

#> ls *
dir.a:
file.0   file.10  file.12  file.14  file.16  file.18  file.2   file.3   file.5   file.7   file.9
file.1   file.11  file.13  file.15  file.17  file.19  file.20  file.4   file.6   file.8

dir.aa:
file.0   file.10  file.12  file.14  file.16  file.18  file.2   file.3   file.5   file.7   file.9
file.1   file.11  file.13  file.15  file.17  file.19  file.20  file.4   file.6   file.8

(...)

The filename expansion mechanism allows a pattern to match against the entire path. Consider an individual file, file.10. Suppose we wanted to match all instances of that file across the subdirectories. We can use the following pattern:

#> ls */file.10
dir.a/file.10   dir.ad/file.10  dir.bc/file.10  dir.ca/file.10  dir.ce/file.10  dir.dc/file.10  dir.eb/file.10
dir.aa/file.10  dir.b/file.10   dir.bd/file.10  dir.cb/file.10  dir.d/file.10   dir.dd/file.10  dir.ec/file.10
dir.ab/file.10  dir.ba/file.10  dir.be/file.10  dir.cc/file.10  dir.da/file.10  dir.de/file.10  dir.ed/file.10
dir.ac/file.10  dir.bb/file.10  dir.c/file.10   dir.cd/file.10  dir.db/file.10  dir.ea/file.10  dir.ee/file.10

All the directories match the *, and thus the pattern refers to file.10 as it exists in each of the subdirectories. Wildcards can be combined in ever more complex ways. For example, here is a pattern to match all files ending with a 5 or a 0, but not file.5 or file.0, and only in directories ending in "a".

#> ls dir.*a/file.?[05]
dir.a/file.10   dir.aa/file.10  dir.ba/file.10  dir.ca/file.10  dir.da/file.10  dir.ea/file.10
dir.a/file.15   dir.aa/file.15  dir.ba/file.15  dir.ca/file.15  dir.da/file.15  dir.ea/file.15
dir.a/file.20   dir.aa/file.20  dir.ba/file.20  dir.ca/file.20  dir.da/file.20  dir.ea/file.20
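The same combined pattern can be reproduced on a tiny directory tree. This is only a sketch: the three directories and four files per directory are invented to keep the output short.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Build three small directories with identical contents.
for d in dir.a dir.b dir.ba; do
    mkdir "$d"
    touch "$d/file.0" "$d/file.5" "$d/file.10" "$d/file.15"
done

# dir.*a selects dir.a and dir.ba but not dir.b;
# file.?[05] needs one arbitrary character followed by a 0 or a 5,
# which rules out file.0 and file.5.
path_matches=$(echo dir.*a/file.?[05])
echo "$path_matches"
# dir.a/file.10 dir.a/file.15 dir.ba/file.10 dir.ba/file.15

cd - >/dev/null
rm -rf "$demo_dir"
```

Each wildcard matches within a single path component, so the / has to be written out explicitly in the pattern.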

3 File Permissions and Ownership: chmod and chown

Continuing our exploration of the UNIX file system and command line operations, we now turn our attention to file ownership and permissions. One of the most important services that the OS provides is security oriented: ensuring that the right user accesses the right file in the right way.

Let's first remind ourselves of the properties of a file that are returned by running ls -l:

.- Directory?
|    .-------Permissions                   .- Directory Name
| ___|___     .----- Owner                 |
v/       \    V     ,---- Group            V
drwxr-x--x 4 aviv scs 4096 Dec 17 15:14 ic221
-rw------- 1 aviv scs 400  Dec 19  2013 .ssh/id_rsa.pub
                       ^   \__________/    ^   
File Size -------------'       |           '- File Name
  in bytes                     |              
                               |
   Last Modified --------------'

There are two important parts to this discussion: the owner/group and the permissions. The owner and the permissions are directly related to each other. Permissions are assigned based on a user's relationship to the file: either being the owner or being part of a group of users who have certain access to the file.

3.1 File Ownership and Groups

The owner of a file is the user that is directly responsible for the file and has special status with respect to the file's permissions. Users can also be collected into a group, a set of users who possess the same permissions. A file also has a group designation to specify which group's permissions should apply.

You all are already aware of your username. You use it all the time, and it should be a part of your command prompt. To have UNIX tell you your username, use the command, who am i:

aviv@saddleback: ~ $ who am i
aviv     pts/24       2014-12-29 10:44 (potbelly.academy.usna.edu)

The first part of the output is the username, for me that is aviv, for you it will be your username. The rest of the information in the output refers to the terminal, the time the terminal was created, and from which host you are connected. We will learn about terminals later in the semester. (And yes, I name my computers after pigs.)

You can determine which groups you are in using the groups command.

aviv@saddleback: ~ $ groups
scs sudo

On this computer, I am in the scs group, which is for computer science faculty members. I am also in the sudo group, which is for users who have super user access to the machine. Since saddleback is my personal work computer, I have sudo access.
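The same information is also available from the id command, which is handy in scripts because each piece can be requested separately. This is a small sketch; the output depends entirely on the account you run it under, so no expected output is shown.

```shell
# Numeric user id and primary group id of the current user.
user_id=$(id -u)
group_id=$(id -g)

# The corresponding names, plus the names of every group the user is in
# (the last one is the same list that the groups command prints).
user_name=$(id -un)
group_name=$(id -gn)
all_groups=$(id -Gn)

echo "user:   $user_name ($user_id)"
echo "group:  $group_name ($group_id)"
echo "groups: $all_groups"
```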

3.2 The password and group file

Groupings are defined in two places. The first is a file called /etc/passwd which manages all the users of the system. Here is my /etc/passwd entry:

aviv@saddleback: ~ $ grep aviv /etc/passwd
aviv:x:35001:10120:Adam Aviv {}:/home/scs/aviv:/bin/bash

The third and fourth fields of that entry are the userid and groupid, which are 35001 and 10120, respectively. These numbers are the actual user and group identifiers that the system works with; Unix nicely converts them into names for our convenience. The translation between userid and username is in the password file. The translation between groupid and group name is in the group file, /etc/group. Here is the SCS entry in the group file:

aviv@saddleback: ~ $ grep scs /etc/group
scs:*:10120:webadmin,www-data,lucas,slack

There you can see that the users webadmin, www-data, lucas and slack are also in the SCS group. While my username is not listed directly, I am still in the scs group as defined by the entry in the password file.
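Both files are plain text with colon-separated fields, so the pieces can be pulled apart with cut. The sketch below works on copies of the two entries shown above rather than on the real files, so it behaves the same on any machine; the variable names are just for illustration.

```shell
# The two entries from the notes, stored as plain strings.
passwd_entry='aviv:x:35001:10120:Adam Aviv {}:/home/scs/aviv:/bin/bash'
group_entry='scs:*:10120:webadmin,www-data,lucas,slack'

# Password file: field 1 is the username, field 3 the userid, field 4 the groupid.
user_name=$(printf '%s\n' "$passwd_entry" | cut -d: -f1)
user_id=$(printf '%s\n' "$passwd_entry" | cut -d: -f3)
primary_gid=$(printf '%s\n' "$passwd_entry" | cut -d: -f4)

# Group file: field 1 is the group name, field 3 the groupid, field 4 the extra members.
group_name=$(printf '%s\n' "$group_entry" | cut -d: -f1)
group_id=$(printf '%s\n' "$group_entry" | cut -d: -f3)
members=$(printf '%s\n' "$group_entry" | cut -d: -f4)

echo "$user_name has userid $user_id and primary groupid $primary_gid"
echo "$group_name has groupid $group_id and extra members $members"
```

The matching groupid, 10120, is what places aviv in the scs group even though aviv does not appear in the member list.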

Take a moment to explore these files and the commands. See what groups you are in.

3.3 File Permissions

We can now turn our attention to the permission string. A permission is simply a sequence of 9 bits broken into 3 groups of 3 bits each, which these notes call octets. Each octet corresponds to one octal (base 8) digit from 0 to 7, and 3 bits uniquely define an octet since every number between 0 and 7 can be represented in 3 bits. (Elsewhere, "octet" usually means an 8-bit byte; here it always means one octal digit.)

Within an octet, there are three permission flags: read, write, and execute. These are often referred to by their shorthand, r, w, and x. Setting a permission to on means that its bit is 1. Thus each possible permission state can be uniquely defined by an octal number:

rwx -> 1 1 1 -> 7
r-x -> 1 0 1 -> 5
--x -> 0 0 1 -> 1
rw- -> 1 1 0 -> 6
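The arithmetic behind this table is that the read bit is worth 4, the write bit 2, and the execute bit 1, and the octal digit is their sum. The helper below, triple_to_octal, is a hypothetical name introduced only to illustrate that calculation.

```shell
# Convert one three-character triple such as "r-x" into its octal digit.
triple_to_octal() {
    digit=0
    case "$1" in r??) digit=$((digit + 4)) ;; esac   # read bit    = 4
    case "$1" in ?w?) digit=$((digit + 2)) ;; esac   # write bit   = 2
    case "$1" in ??x) digit=$((digit + 1)) ;; esac   # execute bit = 1
    echo "$digit"
}

triple_to_octal rwx   # 7
triple_to_octal r-x   # 5
triple_to_octal --x   # 1
triple_to_octal rw-   # 6
```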

A full file permission consists of three octets, in the order user, group, and global (everyone else).

 ,-Directory Bit
|       
|       ,--- Global Permission
v      / \
-rwxr-xr-x 
 \_/\_/
  |  `--Group Permission
  |   
   `-- User Permission

These define what the user who owns the file can do, what users in the file's group can do, and what everyone else can do. A full permission can now be written as 3 octal numbers:

-rwxrwxrwx -> 111 111 111 -> 7 7 7 
-rwxrw-rw- -> 111 110 110 -> 7 6 6
-rwxr-xr-x -> 111 101 101 -> 7 5 5
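Going in the other direction, from three octal digits back to the nine-character string, is a matter of testing each bit. The function mode_to_string below is a hypothetical helper written for illustration; it relies on the shell reading a number with a leading 0 as octal.

```shell
# Turn a three-digit octal mode such as 755 into its rwx string.
mode_to_string() {
    mode=$((0$1))            # the leading 0 makes the shell read the digits as octal
    out=""
    for pos in 6 3 0; do     # bit offsets of the user, group, and global octets
        bits=$(( (mode >> pos) & 7 ))
        if [ $((bits & 4)) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
        if [ $((bits & 2)) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
        if [ $((bits & 1)) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
    done
    echo "$out"
}

mode_to_string 777   # rwxrwxrwx
mode_to_string 766   # rwxrw-rw-
mode_to_string 755   # rwxr-xr-x
```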

To change a file permission, you use the chmod command and indicate the new permission in octal. For example, in the part5 directory, there is an executable file hello_world. Let's try to execute it. To do so, we insert a ./ in front of the name to tell the shell to execute the local file.

#> ./hello_world
-bash: ./hello_world: Permission denied

The shell returns with a permission denied. That's because the execute bit is not set.

#> ls -l hello_world 
-rw------- 1 aviv scs 7856 Dec 23 13:51 hello_world

Let's start by making the file executable by just the user, with the permission 700. And now we can execute the file:

#> chmod 700 hello_world 
#> ls -l hello_world
-rwx------ 1 aviv scs 7856 Dec 23 13:51 hello_world
#> ./hello_world
Hellow World!

This file can only be executed by the user, not by anyone else, because the permissions for the group and the world are still 0. To add group and world permission to execute, we use the permission setting 711:

#> chmod 711 hello_world 
#> ls -l hello_world 
-rwx--x--x 1 aviv scs 7856 Dec 23 13:51 hello_world
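The same sequence of octal settings can be replayed on a throwaway file. In this sketch the file is an empty stand-in created in a scratch directory, and show_perms is a hypothetical helper that keeps only the first ten characters of the ls -l line, that is, the permission string.

```shell
demo_dir=$(mktemp -d)
target="$demo_dir/hello_world"   # empty stand-in for the real program
touch "$target"

# Print just the permission string of a file.
show_perms() { ls -l "$1" | cut -c1-10; }

chmod 600 "$target"
perms_600=$(show_perms "$target")
echo "$perms_600"     # -rw-------

chmod 700 "$target"
perms_700=$(show_perms "$target")
echo "$perms_700"     # -rwx------

chmod 711 "$target"
perms_711=$(show_perms "$target")
echo "$perms_711"     # -rwx--x--x

rm -rf "$demo_dir"
```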

At times using octets can be cumbersome, for example, when you want to set all the execute or read bits but don't want to calculate the octet. In those cases you can use shorthands.

  • r, w, x are shorthands for the permission bits read, write, and execute
  • The + indicates to add a permission, as in +x or +w
  • The - indicates to remove a permission, as in -x or -w
  • u, g, o, a are shorthands for user, group, other (global), and all three at once

Then we can change the permission

chmod +x file   <-- set all the execute bits
chmod a+r file  <-- set the file world readable
chmod -r  file  <-- unset all the read bits
chmod gu+w file <-- set the group and user write bits to true

Depending on the situation, either the octets or the shorthands may be preferable.
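Here is a sketch of the shorthands applied one after another to a scratch file, starting from permission 600. It writes a+x rather than the bare +x, because when no u, g, o, or a is given, chmod leaves alone any bits that are masked out by your umask, so the result of +x can vary from account to account. The show_perms helper is hypothetical and simply trims the ls -l output.

```shell
demo_dir=$(mktemp -d)
target="$demo_dir/file"
touch "$target"
chmod 600 "$target"              # known starting point: -rw-------

# Print just the permission string of a file.
show_perms() { ls -l "$1" | cut -c1-10; }

chmod a+x "$target"              # add execute for user, group, and other
after_add_x=$(show_perms "$target")
echo "$after_add_x"              # -rwx--x--x

chmod a+r "$target"              # add read for user, group, and other
after_add_r=$(show_perms "$target")
echo "$after_add_r"              # -rwxr-xr-x

chmod a-r "$target"              # remove every read bit
after_del_r=$(show_perms "$target")
echo "$after_del_r"              # --wx--x--x

chmod gu+w "$target"             # add write for group and user
after_add_w=$(show_perms "$target")
echo "$after_add_w"              # --wx-wx--x

rm -rf "$demo_dir"
```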

3.4 Changing File Ownership and Group

The last piece of the puzzle is how to change the ownership and group of a file. There are two commands:

  • chown user file/directory : change owner of the file/directory to the user
  • chgrp group file/directory : change group of the file/directory to the group

Permission to change the owner of a file is reserved for the super user for security reasons. Changing the group of a file, however, is something the file's owner is allowed to do.

aviv@saddleback: demo $ ls -l
total 16
-rwxr-x--- 1 aviv scs 9133 Dec 29 10:39 helloworld
-rw-r----- 1 aviv scs   99 Dec 29 10:39 helloworld.cpp
aviv@saddleback: demo $ chgrp mids helloworld
aviv@saddleback: demo $ ls -l
total 16
-rwxr-x--- 1 aviv mids 9133 Dec 29 10:39 helloworld
-rw-r----- 1 aviv scs    99 Dec 29 10:39 helloworld.cpp

Note now the hello world program is in the mids group. I can still execute it because I am the owner:

aviv@saddleback: demo $ ./helloworld 
Hello World

However, if I try to change the owner to, say, pepin, I get the following error:

aviv@saddleback: demo $ chown pepin helloworld
chown: changing ownership of ‘helloworld’: Operation not permitted

Consider why this might be. If any user could change the ownership of a file, then they could potentially upgrade or downgrade the permissions of files, inadvertently or otherwise, violating a security requirement. As such, only the super user, or the administrator, can change ownership settings.
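To check the owner and group of a file from a script, one option is ls -ln, which prints the numeric userid and groupid in the third and fourth columns. The sketch below assumes a scratch file and uses the caller's own primary groupid as the chgrp target, since that change is always open to the file's owner; a chown to another user is left out because it would fail for anyone but the super user.

```shell
demo_dir=$(mktemp -d)
target="$demo_dir/helloworld"    # empty stand-in file
touch "$target"

# chgrp accepts a numeric groupid; id -g is the caller's primary group.
chgrp "$(id -g)" "$target"

# ls -ln prints numeric ids: column 3 is the owner, column 4 is the group.
owner_id=$(ls -ln "$target" | awk '{print $3}')
group_id=$(ls -ln "$target" | awk '{print $4}')
echo "owner $owner_id, group $group_id"

rm -rf "$demo_dir"
```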