Lecture 02: Globbing, Ownership, Groups, and Permissions
1 Pattern Matching in the Unix File System
When working at the command line, we often want to perform an operation over many files at once. For example, maybe we want to use cat to print the contents of all files whose names start with "file". If the directory contained many hundreds of such files, then writing out each file name by hand would be terrible.
To alleviate such annoyances, the Unix shell provides simple mechanisms for pattern matching. Describing patterns that match strings is well studied in computer science and is the basis of much theoretical work. In this class we will apply pattern matching in the shell to perform simple tasks, and, in fact, you've already seen some of this when using grep in the last lab. Generally, we will use two pattern matching techniques:
- Globbing: A shell-driven pattern matching facility that allows the user to match patterns against the names of files and directories.
- Regular Expressions: An expressive pattern matching language that can match the contents and names of files and directories through other command line tools, like sed and grep.
In this lecture, we will focus primarily on globbing, and in the lab, you will have some basic exposure to regular expressions.
2 Globbing
Globbing is a process of filename expansion that gives the user a way to refer to multiple files at the same time. The key idea behind globbing is that a pattern can contain a wildcard that matches anything, and the shell then expands the pattern on the command line into all the files or directories that match it. Think of this like a wildcard in poker: if you have a wildcard, then it can become any card you like. In the same way, when you use a wildcard in a pattern, it can stand in for any character you want to match, in some specified way.
There are four primary wildcards used for globbing:
- * – match zero or more characters
- ? – match exactly one character
- [] – match exactly one character from the set
- [^ ] – match exactly one character not from the set
2.1 Match zero or more with *
As an example, consider the contents of this directory with 201 files, file.0 through file.200:
#> ls
file.0 file.115 file.132 file.15 file.167 file.184 file.200 file.38 file.55 file.72 file.9
file.1 file.116 file.133 file.150 file.168 file.185 file.21 file.39 file.56 file.73 file.90
file.10 file.117 file.134 file.151 file.169 file.186 file.22 file.4 file.57 file.74 file.91
file.100 file.118 file.135 file.152 file.17 file.187 file.23 file.40 file.58 file.75 file.92
file.101 file.119 file.136 file.153 file.170 file.188 file.24 file.41 file.59 file.76 file.93
file.102 file.12 file.137 file.154 file.171 file.189 file.25 file.42 file.6 file.77 file.94
file.103 file.120 file.138 file.155 file.172 file.19 file.26 file.43 file.60 file.78 file.95
file.104 file.121 file.139 file.156 file.173 file.190 file.27 file.44 file.61 file.79 file.96
file.105 file.122 file.14 file.157 file.174 file.191 file.28 file.45 file.62 file.8 file.97
file.106 file.123 file.140 file.158 file.175 file.192 file.29 file.46 file.63 file.80 file.98
file.107 file.124 file.141 file.159 file.176 file.193 file.3 file.47 file.64 file.81 file.99
file.108 file.125 file.142 file.16 file.177 file.194 file.30 file.48 file.65 file.82
file.109 file.126 file.143 file.160 file.178 file.195 file.31 file.49 file.66 file.83
file.11 file.127 file.144 file.161 file.179 file.196 file.32 file.5 file.67 file.84
file.110 file.128 file.145 file.162 file.18 file.197 file.33 file.50 file.68 file.85
file.111 file.129 file.146 file.163 file.180 file.198 file.34 file.51 file.69 file.86
file.112 file.13 file.147 file.164 file.181 file.199 file.35 file.52 file.7 file.87
file.113 file.130 file.148 file.165 file.182 file.2 file.36 file.53 file.70 file.88
file.114 file.131 file.149 file.166 file.183 file.20 file.37 file.54 file.71 file.89
Now suppose I want to match all files that begin with file.4 but can end in anything. So the file file.4 would match, as would file.44 and file.41. To do so, I can use a wildcard, the * or asterisk, like this: file.4*.
#> ls file.4*
file.4 file.40 file.41 file.42 file.43 file.44 file.45 file.46 file.47 file.48 file.49
The * symbol semantically says: "Match zero or more items." That means file.4 matches since there are zero items following the "4", and file.41 matches since there is one item following the "4", the "1". You can see more clearly that * matches zero or more items by looking at the glob file.1*:
#> ls file.1*
file.1 file.109 file.119 file.129 file.139 file.149 file.159 file.169 file.179 file.189 file.199
file.10 file.11 file.12 file.13 file.14 file.15 file.16 file.17 file.18 file.19
file.100 file.110 file.120 file.130 file.140 file.150 file.160 file.170 file.180 file.190
file.101 file.111 file.121 file.131 file.141 file.151 file.161 file.171 file.181 file.191
file.102 file.112 file.122 file.132 file.142 file.152 file.162 file.172 file.182 file.192
file.103 file.113 file.123 file.133 file.143 file.153 file.163 file.173 file.183 file.193
file.104 file.114 file.124 file.134 file.144 file.154 file.164 file.174 file.184 file.194
file.105 file.115 file.125 file.135 file.145 file.155 file.165 file.175 file.185 file.195
file.106 file.116 file.126 file.136 file.146 file.156 file.166 file.176 file.186 file.196
file.107 file.117 file.127 file.137 file.147 file.157 file.167 file.177 file.187 file.197
file.108 file.118 file.128 file.138 file.148 file.158 file.168 file.178 file.188 file.198
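The "zero or more" behavior of * can be checked in a scratch directory. The following is a minimal sketch, assuming sh or bash with mktemp available; the helper count_matches and the variable names are made up for this illustration and are not part of the lecture's files.

```shell
# Recreate the lecture's layout in a scratch directory: file.0 ... file.200.
demo=$(mktemp -d)
cd "$demo" || exit 1
i=0
while [ "$i" -le 200 ]; do
    : > "file.$i"            # create an empty file
    i=$((i + 1))
done

# count_matches PATTERN (hypothetical helper): print how many names match.
# $1 is left unquoted on purpose so that the shell glob-expands it.
count_matches() {
    set -- $1
    # With no match, the shell leaves the pattern as a literal word.
    if [ -e "$1" ]; then echo "$#"; else echo 0; fi
}

n4=$(count_matches 'file.4*')     # file.4 and file.40 ... file.49
n1=$(count_matches 'file.1*')     # file.1, file.10 ... file.19, file.100 ... file.199
n_none=$(count_matches 'nope*')   # no file starts with "nope"
echo "file.4* -> $n4, file.1* -> $n1, nope* -> $n_none"

cd / && rm -rf "$demo"            # clean up the scratch directory
```

Note that the pattern is passed in single quotes so that it reaches the helper unexpanded; the expansion happens inside the helper, in the current directory.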
2.2 Match exactly 1 with ?
There are situations when you want to match exactly 1 character. For example, suppose we want to list only the files file.40 through file.49, but not file.4. If the only wildcard we had was the *, then we would not be able to write a glob for that, and worse, if the possible names also contained items like file.400, then excluding those would not be possible either.
Instead, we need a glob wildcard that can do a limited match. For that we use the ? wildcard, which matches exactly 1 item. So we can now write the condition for file.40 through file.49 as the glob file.4?.
#> ls file.4?
file.40 file.41 file.42 file.43 file.44 file.45 file.46 file.47 file.48 file.49
Notice that file.4 does not match file.4? because file.4 has no character after the "4", and the ? wildcard requires exactly one. You can also place the ? in the middle of a glob.
#> ls file.1?5
file.105 file.115 file.125 file.135 file.145 file.155 file.165 file.175 file.185 file.195
On your own, what would file.1?? match?
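To see the difference between * and ? side by side, here is a small sketch, assuming sh or bash with mktemp available. It uses a handful of hypothetical file names, including a file.400 that the lecture's directory does not contain, and lets echo print whatever each glob expands to.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file.4 file.40 file.41 file.49 file.400

star=$(echo file.4*)     # zero or more characters after the 4
one=$(echo file.4?)      # exactly one character after the 4
two=$(echo file.4??)     # exactly two characters after the 4
echo "file.4*  -> $star"
echo "file.4?  -> $one"
echo "file.4?? -> $two"

cd / && rm -rf "$demo"   # clean up the scratch directory
```

Only the ? forms can separate file.40 from file.400, since each ? consumes exactly one character.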
2.3 Match from a set with [ ] and [ ^ ]
Finally, to complete the matching capabilities of globs, we need a way to match from a subset of choices. Consider a situation where you want to match all files matching file.13? or file.15?. That is, the second digit in the file number can be either a 3 or a 5, and we can describe that using a [] wildcard like file.1[35]?:
#> ls file.1[35]?
file.130 file.132 file.134 file.136 file.138 file.150 file.152 file.154 file.156 file.158
file.131 file.133 file.135 file.137 file.139 file.151 file.153 file.155 file.157 file.159
You can also negate a set, matching anything that is not in the [].
#> ls file.1[^35]?
file.100 file.108 file.116 file.124 file.142 file.160 file.168 file.176 file.184 file.192
file.101 file.109 file.117 file.125 file.143 file.161 file.169 file.177 file.185 file.193
file.102 file.110 file.118 file.126 file.144 file.162 file.170 file.178 file.186 file.194
file.103 file.111 file.119 file.127 file.145 file.163 file.171 file.179 file.187 file.195
file.104 file.112 file.120 file.128 file.146 file.164 file.172 file.180 file.188 file.196
file.105 file.113 file.121 file.129 file.147 file.165 file.173 file.181 file.189 file.197
file.106 file.114 file.122 file.140 file.148 file.166 file.174 file.182 file.190 file.198
file.107 file.115 file.123 file.141 file.149 file.167 file.175 file.183 file.191 file.199
Note that a set glob is like a ? in that it matches exactly one character, so the glob file.[1][1][1] will only match file.111.
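The set wildcards can be tried on a small hypothetical set of files. This sketch assumes sh or bash with mktemp available. One assumption to flag: it writes the negated set as [!35], which is the spelling POSIX defines; bash also accepts the [^35] form used above.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file.13 file.111 file.130 file.135 file.140 file.150 file.159

in_set=$(echo file.1[35]?)        # second digit is 3 or 5, then any one character
not_in_set=$(echo file.1[!35]?)   # second digit is anything except 3 or 5
single=$(echo file.[1][1][1])     # three one-element sets, each matching one "1"
echo "file.1[35]?     -> $in_set"
echo "file.1[!35]?    -> $not_in_set"
echo "file.[1][1][1]  -> $single"

cd / && rm -rf "$demo"            # clean up the scratch directory
```

file.13 appears in neither result: both patterns require one more character after the set.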
2.4 Subdirectory Matching
Globs can be used to match subdirectories. Consider the following directory layout:
#> ls
dir.a/ dir.ab/ dir.ad/ dir.ba/ dir.bc/ dir.be/ dir.ca/ dir.cc/ dir.ce/ dir.da/ dir.dc/ dir.de/ dir.eb/ dir.ed/
dir.aa/ dir.ac/ dir.b/ dir.bb/ dir.bd/ dir.c/ dir.cb/ dir.cd/ dir.d/ dir.db/ dir.dd/ dir.ea/ dir.ec/ dir.ee/
And each directory has the following files, which you can explore with ls and the * glob:
#> ls *
dir.a:
file.0 file.10 file.12 file.14 file.16 file.18 file.2 file.3 file.5 file.7 file.9
file.1 file.11 file.13 file.15 file.17 file.19 file.20 file.4 file.6 file.8

dir.aa:
file.0 file.10 file.12 file.14 file.16 file.18 file.2 file.3 file.5 file.7 file.9
file.1 file.11 file.13 file.15 file.17 file.19 file.20 file.4 file.6 file.8

(...)
The file expansion mechanism for a pattern allows the user to match the entire path. Consider an individual file, file.10. If we wanted to match all instances of that file across the subdirectories, we can use the following pattern:
#> ls */file.10
dir.a/file.10 dir.ad/file.10 dir.bc/file.10 dir.ca/file.10 dir.ce/file.10 dir.dc/file.10 dir.eb/file.10
dir.aa/file.10 dir.b/file.10 dir.bd/file.10 dir.cb/file.10 dir.d/file.10 dir.dd/file.10 dir.ec/file.10
dir.ab/file.10 dir.ba/file.10 dir.be/file.10 dir.cc/file.10 dir.da/file.10 dir.de/file.10 dir.ed/file.10
dir.ac/file.10 dir.bb/file.10 dir.c/file.10 dir.cd/file.10 dir.db/file.10 dir.ea/file.10 dir.ee/file.10
All the directories match the *, and thus the pattern refers to file.10 as it exists in each of the subdirectories. These patterns can be built upon each other in ever more complex ways. For example, here is a pattern to match all files ending with a 5 or a 0, but not file.5 or file.0, and only in directories ending in "a":
#> ls dir.*a/file.?[05]
dir.a/file.10 dir.aa/file.10 dir.ba/file.10 dir.ca/file.10 dir.da/file.10 dir.ea/file.10
dir.a/file.15 dir.aa/file.15 dir.ba/file.15 dir.ca/file.15 dir.da/file.15 dir.ea/file.15
dir.a/file.20 dir.aa/file.20 dir.ba/file.20 dir.ca/file.20 dir.da/file.20 dir.ea/file.20
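Path patterns like these can be explored on a smaller layout. The sketch below assumes sh or bash with mktemp available and builds a reduced, hypothetical version of the directory tree (five directories, five files each). It counts matches by loading the expansion into the positional parameters, so the result does not depend on how the locale sorts names.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
for d in dir.a dir.aa dir.ab dir.b dir.ba; do
    mkdir "$d"
    for n in 0 5 10 15 20; do
        : > "$d/file.$n"          # create an empty file
    done
done

# Each glob is expanded into "$@", and $# is the number of matching paths.
set -- */file.10
all_dirs=$#                       # one file.10 per directory
set -- dir.*a/file.?[05]
ends_in_a=$#                      # dir.a, dir.aa, dir.ba times file.10, file.15, file.20
echo "*/file.10 -> $all_dirs paths, dir.*a/file.?[05] -> $ends_in_a paths"

cd / && rm -rf "$demo"            # clean up the scratch directory
```

dir.a is included because the * in dir.*a can match zero characters, while file.0 and file.5 are excluded because the ? demands a character before the final 0 or 5.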
3 File Permissions and Ownership: chmod and chown
Continuing our exploration of the UNIX file system and command line operations, we now turn our attention to file ownership and permissions. One of the most important services that the OS provides is security: ensuring that the right user accesses the right file in the right way.
Let's first remind ourselves of the properties of a file that are returned by running ls -l:
.- Directory?
|    .------- Permissions               .- Directory Name
| ___|___      .---- Owner              |
v/       \     v   ,---- Group          v
drwxr-x--x 4 aviv scs 4096 Dec 17 15:14 ic221
-rw------- 1 aviv scs  400 Dec 19  2013 .ssh/id_rsa.pub
                        ^  \__________/ ^
File Size in bytes -----'       |       '- File Name
Last Modified ------------------'
There are two important parts to this discussion: the owner/group and the permissions. The two are directly related to each other: permissions are assigned based on a user's status with respect to the file, either as the owner or as part of a group of users who have certain access to the file.
3.1 File Ownership and Groups
The owner of a file is the user that is directly responsible for the file and has special status with respect to the file's permissions. Users can also be collected into a group, a set of users who share the same permissions. A file also has a group designation to specify which users the group permissions apply to.
You are all already aware of your username. You use it all the time, and it should be a part of your command prompt. To have UNIX tell you your username, use the command who am i:
aviv@saddleback: ~ $ who am i
aviv pts/24 2014-12-29 10:44 (potbelly.academy.usna.edu)
The first part of the output is the username; for me that is aviv, and for you it will be your username. The rest of the output refers to the terminal, the time the terminal was created, and the host from which you are connected. We will learn about terminals later in the semester. (And yes, I name my computers after pigs.)
You can determine which groups you are in using the groups command.
aviv@saddleback: ~ $ groups
scs sudo
On this computer, I am in the scs group, which is for computer science faculty members. I am also in the sudo group, which is for users who have super user access to the machine. Since saddleback is my personal work computer, I have sudo access.
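The same facts can also be collected in a script. As a hedged sketch, the snippet below uses the standard id utility rather than the who am i and groups commands shown above; the variable names are made up for illustration, and the output will of course differ from machine to machine.

```shell
# id reports the identity of the user running the script.
user=$(id -un)          # username
uid=$(id -u)            # numeric user id behind that name
primary=$(id -gn)       # name of the primary group
all_groups=$(id -Gn)    # names of every group the user belongs to
echo "user:    $user (uid $uid)"
echo "primary: $primary"
echo "groups:  $all_groups"
```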
3.2 The password and group file
Groupings are defined in two places. The first is a file called /etc/passwd, which manages all the users of the system. Here is my /etc/passwd entry:
aviv@saddleback: ~ $ grep aviv /etc/passwd
aviv:x:35001:10120:Adam Aviv {}:/home/scs/aviv:/bin/bash
The third and fourth fields of that entry are the userid and groupid, which are 35001 and 10120, respectively. These numbers are how the system actually identifies users and groups, but Unix nicely converts them into names for our convenience. The translation between userid and username is in the password file. The translation between groupid and group name is in the group file, /etc/group. Here is the scs entry in the group file:
aviv@saddleback: ~ $ grep scs /etc/group
scs:*:10120:webadmin,www-data,lucas,slack
There you can see that the users webadmin, www-data, lucas, and slack are also in the scs group. While my username is not listed directly, I am still in the scs group as defined by my entry in the password file.
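Both files are plain colon-separated text, so the lookup described above can be traced by hand. The sketch below is illustrative only: it splits the two example entries from this section (copied as strings, not read from a real system) and checks that the groupid in the passwd entry is the one the group entry names. The variable names are assumptions of this sketch.

```shell
entry='aviv:x:35001:10120:Adam Aviv {}:/home/scs/aviv:/bin/bash'
group_entry='scs:*:10120:webadmin,www-data,lucas,slack'

# passwd fields: name, password placeholder, userid, groupid, comment, home, shell
IFS=: read -r name pw uid gid comment home shell <<EOF
$entry
EOF

# group fields: group name, password placeholder, groupid, member list
IFS=: read -r gname gpw ggid members <<EOF
$group_entry
EOF

echo "user $name has userid $uid and groupid $gid"
if [ "$gid" = "$ggid" ]; then
    # The link is the shared number, not the member list.
    echo "groupid $gid is $gname, so $name is in $gname"
fi
```

This is why aviv does not need to appear in the member list: the primary group comes from the fourth field of the passwd entry.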
Take a moment to explore these files and the commands. See what groups you are in.
3.3 File Permissions
We can now turn our attention to the permission string. A permission is simply a sequence of 9 bits broken into 3 groups of 3 bits each. Each group can be written as a single octal (base 8) digit from 0 to 7, which we will loosely call an octet; 3 bits uniquely define such a digit, since all the numbers between 0 and 7 can be represented in 3 bits.
Within an octet, there are three permission flags: read, write, and execute. These are often referred to by their shorthand, r, w, and x. Setting a permission to on means that its bit is 1. Thus, each possible permission state can be uniquely defined by an octal number:
rwx -> 1 1 1 -> 7
r-x -> 1 0 1 -> 5
--x -> 0 0 1 -> 1
rw- -> 1 1 0 -> 6
A full file permission consists of three octets, in the order user, group, and global permission.
,- Directory Bit
|
|       ,--- Global Permission
v      / \
-rwxr-xr-x
 \_/\_/
  |  `-- Group Permission
  |
  `-- User Permission
These define what the user who owns the file, what users in the file's group, and what everyone else can do. A full permission can now be written as 3 octal numbers:
-rwxrwxrwx -> 111 111 111 -> 7 7 7
-rwxrw-rw- -> 111 110 110 -> 7 6 6
-rwxr-xr-x -> 111 101 101 -> 7 5 5
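The conversion in the table is mechanical: within each group of three, r is worth 4, w is worth 2, and x is worth 1. A minimal sketch of that arithmetic follows; the function name perm_to_octal is hypothetical, and it only handles the r, w, x, and - characters covered in this section.

```shell
# perm_to_octal MODE (hypothetical helper): convert a 10-character ls-style
# mode such as "-rwxr-xr-x" into its three octal digits.
perm_to_octal() {
    [ "${#1}" -eq 10 ] || { echo "expected 10 characters" >&2; return 1; }
    perms=${1#?}                     # drop the leading file-type character
    result=""
    while [ -n "$perms" ]; do
        rest=${perms#???}            # everything after the first three characters
        triple=${perms%"$rest"}      # the first three characters
        digit=0
        case $triple in r??) digit=$((digit + 4)) ;; esac   # read bit
        case $triple in ?w?) digit=$((digit + 2)) ;; esac   # write bit
        case $triple in ??x) digit=$((digit + 1)) ;; esac   # execute bit
        result=$result$digit
        perms=$rest
    done
    echo "$result"
}

perm_to_octal '-rwxrwxrwx'
perm_to_octal '-rwxr-xr-x'
perm_to_octal 'drwxr-x--x'
```

The first two calls reproduce rows of the table above; the third applies the same rule to the ic221 directory from the earlier ls -l listing.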
To change a file permission, you use the chmod command and indicate the new permission in octal. For example, in the part5 directory, there is an executable file hello_world. Let's try to execute it. To do so, we insert a ./ in front to tell the shell to execute the file in the current directory.
#> ./hello_world
-bash: ./hello_world: Permission denied
The shell returns with a permission denied. That's because the execute bit is not set.
#> ls -l hello_world
-rw------- 1 aviv scs 7856 Dec 23 13:51 hello_world
Let's start by making the file just executable by the user, the permission 700. And now we can execute the file:
#> chmod 700 hello_world
#> ls -l hello_world
-rwx------ 1 aviv scs 7856 Dec 23 13:51 hello_world
#> ./hello_world
Hellow World!
This file can only be executed by the user, not by anyone else, because the permissions for the group and the world are still 0. To add group and world permission to execute, we use the permission setting 711:
#> chmod 711 hello_world
#> ls -l hello_world
-rwx--x--x 1 aviv scs 7856 Dec 23 13:51 hello_world
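The same sequence can be replayed safely in a scratch directory. This is a sketch under stated assumptions: the hello_world created here is a hypothetical two-line stand-in script rather than the program from part5, and show_mode is a made-up helper that trims ls -l down to the permission string.

```shell
demo=$(mktemp -d)
f=$demo/hello_world
printf '#!/bin/sh\necho "Hello World!"\n' > "$f"   # stand-in for the real program

# show_mode FILE (hypothetical helper): print the 10-character mode from ls -l.
show_mode() { ls -ld "$1" | cut -c1-10; }

chmod 600 "$f"; m600=$(show_mode "$f")   # read and write for the user only
chmod 700 "$f"; m700=$(show_mode "$f")   # add execute for the user
chmod 711 "$f"; m711=$(show_mode "$f")   # add execute for group and world
echo "600 -> $m600"
echo "700 -> $m700"
echo "711 -> $m711"

rm -rf "$demo"                           # clean up the scratch directory
```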
At times using octets can be cumbersome, for example, when you want to set all the execute or read bits but don't want to calculate the octet. In those cases you can use shorthands.
- r, w, x: shorthands for the permission bits read, write, and execute
- The + indicates to add a permission, as in +x or +w
- The - indicates to remove a permission, as in -x or -w
- u, g, o, a: shorthands for user, group, other (global), and all three at once
Then we can change the permission like so:
chmod +x file    <-- set all the execute bits
chmod a+r file   <-- set the file world readable
chmod -r file    <-- unset all the read bits
chmod gu+w file  <-- set the group and user write bits to true
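The effect of each shorthand is easiest to see by watching the permission string change one step at a time. The sketch below assumes sh or bash with mktemp available; the file name notes.txt and the helper show_mode are hypothetical. Every command names its target (u, g, o, or a) explicitly, because when the letter is omitted, as in chmod +x, chmod leaves alone any bits masked out by the user's umask.

```shell
demo=$(mktemp -d)
f=$demo/notes.txt
: > "$f"                                 # create an empty file

# show_mode FILE (hypothetical helper): print the 10-character mode from ls -l.
show_mode() { ls -ld "$1" | cut -c1-10; }

chmod 600 "$f"                           # known starting point: -rw-------
chmod u+x "$f";  s1=$(show_mode "$f")    # add execute for the user
chmod a+r "$f";  s2=$(show_mode "$f")    # add read for user, group, and other
chmod gu+w "$f"; s3=$(show_mode "$f")    # add write for group and user
chmod o-r "$f";  s4=$(show_mode "$f")    # remove read from other
echo "u+x  -> $s1"
echo "a+r  -> $s2"
echo "gu+w -> $s3"
echo "o-r  -> $s4"

rm -rf "$demo"                           # clean up the scratch directory
```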
Depending on the situation, either the octets or the shorthands may be preferable.
3.4 Changing File Ownership and Group
The last piece of the puzzle is how to change the ownership and group of a file. There are two commands:
- chown user file/directory: change the owner of the file/directory to the user
- chgrp group file/directory: change the group of the file/directory to the group
Permission to change the owner of a file is reserved for the super user, for security reasons. Changing the group of a file, however, can also be done by the file's owner.
aviv@saddleback: demo $ ls -l
total 16
-rwxr-x--- 1 aviv scs 9133 Dec 29 10:39 helloworld
-rw-r----- 1 aviv scs   99 Dec 29 10:39 helloworld.cpp
aviv@saddleback: demo $ chgrp mids helloworld
aviv@saddleback: demo $ ls -l
total 16
-rwxr-x--- 1 aviv mids 9133 Dec 29 10:39 helloworld
-rw-r----- 1 aviv scs    99 Dec 29 10:39 helloworld.cpp
Note now the hello world program is in the mids group. I can still execute it because I am the owner:
aviv@saddleback: demo $ ./helloworld
Hello World
However, if I try to change the owner to, say, pepin, we get the following error:
aviv@saddleback: demo $ chown pepin helloworld
chown: changing ownership of ‘helloworld’: Operation not permitted
Consider why this might be. If any user can change the ownership of a file, then they could potentially upgrade or downgrade the permissions of files inadvertently, violating a security requirement. As such, only the super user, or the administrator, can change ownership settings.
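The group half of this rule can be tried without super user access. The following is a rough sketch, assuming mktemp, id, and awk are available: it creates a scratch file and moves it through each group the current user belongs to, using numeric groupids so that unusual group names cannot confuse the loop. The variable names are made up for illustration, and it deliberately does not attempt chown, which would fail for an ordinary user as shown above.

```shell
f=$(mktemp)                                   # scratch file owned by the current user

# file_gid FILE (hypothetical helper): numeric group of FILE, column 4 of ls -ln.
file_gid() { ls -ln "$1" | awk '{print $4}'; }

last_ok=$(file_gid "$f")                      # group the file starts out in
for gid in $(id -G); do                       # numeric ids of all our groups
    if chgrp "$gid" "$f"; then
        last_ok=$gid                          # remember the last change that worked
    else
        echo "could not change group to $gid"
    fi
done
final_gid=$(file_gid "$f")
echo "$f now belongs to group $final_gid"

rm -f "$f"                                    # clean up the scratch file
```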