Lecture 06: C Strings, String Library, and Pointer Arithmetic
1 Review Arrays in C
In this lesson we will be discussing C strings, which are essentially arrays of char's that are NULL terminated. Due to the close link between arrays, strings, and pointers, it's good to review some of that material first before diving into the nuances of C strings.
Recall that an array is a contiguous region of memory that stores a sequence of items of the same type. We declare arrays statically using the [ ] symbols and a size, and you can also reference and assign to an array using the [ ] symbols.
int array[10];
int i;
for(i=0;i < 10; i++){
  array[i] = i*2;
}
Additionally, arrays and pointers are closely linked, and, in fact, an array variable is a special type of pointer whose value cannot change. When you declare an array:
int array[10];
You are asking C to do two things. First, this is a request to allocate 10 integers of memory, contiguously, or 40 bytes (assuming 4-byte integers). The second part is to assign the address of that memory allocation to the variable array and make it constant so that the address array refers to cannot change.
Essentially, array is a pointer to the contiguous memory. We can then access the individual integers in that memory region using the [ ] operator. But, we also know that this operation is equivalent to a dereference.
          .---.
array --> |   |  array[0] == *(array+0)
          +---+
          |   |  array[1] == *(array+1)
          +---+
          |   |  array[2] == *(array+2)
          +---+
          :   :  etc.
          '   '
When you index into an array, you are effectively following the pointer plus the index. That is, the operation array[i] says to follow the pointer referenced by the variable array, move i steps further, and then return the value found at that memory location. Pairing arrays and pointers in this style is called pointer arithmetic; it is an incredibly powerful tool of C programming and is used a lot with C strings.
2 C Strings
A string in C is simply an array of char objects that is null terminated. Here's a typical C string declaration:
char str[] = "Hello!";
A couple things to note about the declaration:
- First, we declare str like an array, but we do not provide it a size.
- Second, we assign to str a quoted string.
- Finally, while we know that strings are NULL terminated, there is no explicit NULL termination.
We will tackle each of these in turn below.
2.1 Advanced Array Declarations
While the declaration looks awkward at first without the array size, omitting it actually means that the size will be determined automatically by the assignment. All arrays can be declared in this static way; here is an example for an integer array:
int array[] = {1, 2, 3};
In that example, the array values are listed inside { } and separated by commas. The length of the array is clearly 3, but the compiler can determine that by inspecting the static declaration, so it is often omitted. However, that does not mean you cannot provide a size, for example
int array[10] = {1, 2, 3};
is also perfectly fine but has a different semantic meaning. The first declaration (without a size) says to allocate only enough memory to store the statically declared array. The second declaration (with the size) says to allocate enough memory to store size items of the data type and to initialize as many of them as possible from the given values.
You can see this actually happening in this simple program:
/*array_deceleration.c*/
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv[]){
  int a[] = {1,2,3};
  int b[10] = {1,2,3};
  int i;

  printf("sizeof(a):%d sizeof(b):%d\n",
         (int) sizeof(a),
         (int) sizeof(b)
         );

  printf("\n");
  for(i=0;i<3;i++){
    printf("a[%d]: %d\n", i,a[i]);
  }

  printf("\n");
  for(i=0;i<10;i++){
    printf("b[%d]: %d\n", i,b[i]);
  }
}
aviv@saddleback: demo $ ./array_decleration
sizeof(a):12 sizeof(b):40

a[0]: 1
a[1]: 2
a[2]: 3

b[0]: 1
b[1]: 2
b[2]: 3
b[3]: 0
b[4]: 0
b[5]: 0
b[6]: 0
b[7]: 0
b[8]: 0
b[9]: 0
As you can see, both declarations work, but the allocation sizes are different. Array b is allocated to store 10 integers with a size of 40 bytes, while array a is only allocated enough to store the static declaration. Also note that the allocation implicitly filled in 0 for the elements of b that were not given in the declaration, which is behavior you'd expect.
2.2 The quoted string declaration
Now that you have a broader sense of how arrays are declared, let's adapt this to strings. The first thing we can try is to declare a string, that is, an array of char's, using the same kind of declaration we had above.
char a[] = {'G','o',' ','N','a','v','y','!'};
char b[10] = {'G','o',' ','N','a','v','y','!'};
Just as before, we are declaring an array of the given type, which is char. We also use the static declaration for arrays. At this point we should feel pretty good: we have a string. But not really. Let's look at an example using this declaration:
/*string_declerations.c*/
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv[]){
  char a[] = {'G','o',' ','N','a','v','y','!'};
  char b[10] = {'G','o',' ','N','a','v','y','!'};
  int i;

  printf("sizeof(a):%d sizeof(b):%d\n",
         (int) sizeof(a),
         (int) sizeof(b)
         );

  printf("\n");
  for(i=0;i<8;i++){
    //print char and ASCII value
    printf("a[%d]: %c (%d)\n", i,a[i],a[i]);
  }

  printf("\n");
  for(i=0;i<10;i++){
    //print char and ASCII value
    printf("b[%d]: %c (%d) \n", i,b[i],b[i]);
  }

  printf("\n");
  printf("a: %s\n",a); //format print the string
  printf("b: %s\n",b); //format print the string
}
aviv@saddleback: demo $ ./string_declerations
sizeof(a):8 sizeof(b):10

a[0]: G (71)
a[1]: o (111)
a[2]:   (32)
a[3]: N (78)
a[4]: a (97)
a[5]: v (118)
a[6]: y (121)
a[7]: ! (33)

b[0]: G (71)
b[1]: o (111)
b[2]:   (32)
b[3]: N (78)
b[4]: a (97)
b[5]: v (118)
b[6]: y (121)
b[7]: ! (33)
b[8]:  (0)
b[9]:  (0)

a: Go Navy!?@
b: Go Navy!
The first observation is that the sizeof the arrays matches our expectations. A char is 1 byte in size, and the arrays are allocated to match either the implicit size (8) or the explicit size (10). We can also print the arrays iteratively, and the ASCII values are included for reference. However, when we try to format print the string using the %s format, something strange happens for a that does not happen for b.
The problem is that a is not NULL terminated, that is, the last char numeric value in the string is not 0. NULL termination is very important for determining the length of the string. Without this special marker, the printf() function is unable to determine when the string ends, so it prints extra characters that are not really part of the string. Array b happens to print correctly only because its two unused elements were filled with 0.
We can change the declaration of a to explicitly NULL terminate like so:
char a[] = {'G','o',' ','N','a','v','y','!', '\0'};
The escape sequence '\0' is the null character, whose numeric value is 0 just like NULL, and now we have a legal string. But I think we can all agree this is a really annoying way to do string declarations, because all strings should be NULL terminated anyway. Thus, the double quoted string shorthand is used.
char a[] = "Go Navy!";
The quoted string is the same as statically declaring an array with an implicit NULL termination, and it is ever so much more convenient to use. You can also explicitly declare the size, as in the example below, which declares an array of that size and still NULL terminates the string.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv[]){
  char a[] = "Go Navy!";
  char b[10] = "Go Navy!";
  int i;

  printf("sizeof(a):%d sizeof(b):%d\n",
         (int) sizeof(a),
         (int) sizeof(b)
         );

  printf("\n");
  for(i=0;i<9;i++){
    //print char and ASCII value
    printf("a[%d]: %c (%d)\n", i,a[i],a[i]);
  }

  printf("\n");
  for(i=0;i<10;i++){
    //print char and ASCII value
    printf("b[%d]: %c (%d) \n", i,b[i],b[i]);
  }

  printf("\n");
  printf("a: %s\n",a); //format print the string
  printf("b: %s\n",b); //format print the string
}
aviv@saddleback: demo $ ./string_quoted
sizeof(a):9 sizeof(b):10

a[0]: G (71)
a[1]: o (111)
a[2]:   (32)
a[3]: N (78)
a[4]: a (97)
a[5]: v (118)
a[6]: y (121)
a[7]: ! (33)
a[8]:  (0)

b[0]: G (71)
b[1]: o (111)
b[2]:   (32)
b[3]: N (78)
b[4]: a (97)
b[5]: v (118)
b[6]: y (121)
b[7]: ! (33)
b[8]:  (0)
b[9]:  (0)

a: Go Navy!
b: Go Navy!
You may now be wondering what happens if you do something silly like this,
char a[3] = "Go Navy!";
where you declare the string to be of size 3 but assign a string requiring much more memory? Well … why don't you try writing a small program to find out what happens, which you will do in homework.
2.3 String format input, output, overflows, and NULL dereferences
While strings are not basic types, like numbers, they do have a special place in a lot of operations because we use them so commonly. One such place is in formats.
You already saw above that %s is the format character to print a string, and it is also the format character used to scan a string. We can see how this all works using this simple example:
/*format_string*/
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){
  char name[20];

  printf("What is your name?\n");
  scanf("%s",name);

  printf("\n");
  printf("Hello %s!\n",name);
}
There are two format operations in this program. It first asks the user for their name and reads the response using scanf(). Looking more closely, when you provide name as the second argument to scanf(), you are saying: "Read in a string and write it to the memory referenced by name." Later, we can print name using a %s in a printf(). Here is a sample execution:
aviv@saddleback: demo $ ./format_string
What is your name?
Adam

Hello Adam!
That works great. Let's try some other input:
aviv@saddleback: demo $ ./format_string
What is your name?
Adam Aviv

Hello Adam!
Hmm. That didn't work as expected. Instead of reading in the whole input "Adam Aviv", it only read a single word, "Adam". This has to do with how scanf() works: "%s" does not refer to an entire line but just an individual whitespace-separated string.
The other thing to notice is that the string name is of a fixed size, 20 bytes. What happens if I provide input that is longer … much longer?
aviv@saddleback: demo $ ./format_string
What is your name?
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAdam

Hello AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAdam!
*** stack smashing detected ***: ./format_string terminated
Aborted (core dumped)
That was interesting. The execution detected that you overflowed the string, that is, tried to write more than 20 bytes. This caused a check to go off and the program to abort. This is not quite a segmentation fault, which generally occurs when you try to read or write invalid memory, i.e., outside the allowable memory segments.
We can go even further with this example and come up with a name sooooooo long that the program crashes in a different way:
What is your name? AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA Hello AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA! Segmentation fault (core dumped)
In this case, we got a segmentation fault. The scanf() wrote so far out of bounds of the array that it wrote to memory it was not allowed to touch. This caused the segmentation fault.
Another way you can get a segmentation fault is by dereferencing NULL, that is, you have a pointer value that equals NULL and you try to follow the pointer to memory that does not exist.
/*null_print.c*/
#include <stdio.h>
#include <stdlib.h>

int main(int argc,char*argv[]){

  printf("This is a bad idea ...\n");
  printf("%s\n",(char *) NULL);

}
aviv@saddleback: demo $ ./null_print
This is a bad idea ...
Segmentation fault (core dumped)
This example is relatively silly, as I purposely dereference NULL by trying to treat it as a string. While you might not do it so blatantly, you will do something like this at some point. It is a mistake we all make as programmers, and it is a particularly annoying mistake that is inevitable when you program with pointers and strings. It can be frustrating, but we will also go over many ways to debug such errors throughout the semester.
3 String Library Functions
Working with strings in C is not as straightforward as it is in C++ because they are not basic types, but rather arrays of characters. Truth be told, in C++ they are also arrays of characters; however, C++ provides a special library that overloads the basic operations so you can treat C++ strings like basic types. Unfortunately, such conveniences are not possible in C.
As a result, certain operations that seem obvious do not behave in C as you would expect. Here's an example:
/*string_badcmp.c*/
#include <stdio.h>
#include <stdlib.h>

int main(int argc,char *argv[]){
  char str[20];

  printf("Enter 'Navy' for a secret message:\n");
  scanf("%s",str);

  if( str == "Navy"){
    printf("Go Navy! Beat Army!\n");
  }else{
    printf("No secret for you.\n");
  }
}
And if we run this program and enter in the appropriate string, we do not get the result we expect.
aviv@saddleback: demo $ ./string_badcmp
Enter 'Navy' for a secret message:
Navy
No secret for you.
What happened? If we look at the if statement expression:
if( str == "Navy" )
Our intuition is that this will compare the string str and "Navy" based on the values in the string, that is, is str "Navy"? But that is not what this is doing. Remember that a string is an array of characters and an array is a pointer to memory, so the equality checks whether str and "Navy" are stored in the same place in memory. It has nothing to do with the actual contents of the strings.
To see that this is the case, consider this small program, which also does not do what is expected:
/*string_badequals.c*/
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){
  char s1[]="Navy";
  char s2[]="Navy";

  if(s1 == s2){
    printf("Go Navy!\n");
  }else{
    printf("Beat Army!\n");
  }

  printf("\n");
  printf("s1: %p \n", s1);
  printf("s2: %p \n",s2);
}
aviv@saddleback: demo $ ./string_badequals
Beat Army!

s1: 0x7fffe43994f0
s2: 0x7fffe4399500
Looking closely, although both s1 and s2 hold the same string values, they are not the same string in memory and have two different addresses. (The %p formats a memory address in hexadecimal.)
The right way to compare two strings is to compare each character, but that is a lot of extra code and something we don't want to write every time. Fortunately, it's been implemented for us, along with a number of other useful functions, in the string library.
3.1 The string library string.h
To see all the goodness in the string library, start by typing man string in your Linux terminal. Up will come the manual page for all the functions in the string library:
STRING(3)                 Linux Programmer's Manual                STRING(3)

NAME
       stpcpy, strcasecmp, strcat, strchr, strcmp, strcoll, strcpy, strcspn,
       strdup, strfry, strlen, strncat, strncmp, strncpy, strncasecmp, strpbrk,
       strrchr, strsep, strspn, strstr, strtok, strxfrm, index, rindex - string
       operations

SYNOPSIS
       #include <strings.h>

       int strcasecmp(const char *s1, const char *s2);

       int strncasecmp(const char *s1, const char *s2, size_t n);

       char *index(const char *s, int c);

       char *rindex(const char *s, int c);

       #include <string.h>

       char *stpcpy(char *dest, const char *src);

       char *strcat(char *dest, const char *src);

       char *strchr(const char *s, int c);

       int strcmp(const char *s1, const char *s2);

       int strcoll(const char *s1, const char *s2);

       char *strcpy(char *dest, const char *src);

       size_t strcspn(const char *s, const char *reject);

       char *strdup(const char *s);

       char *strfry(char *string);

       size_t strlen(const char *s);

       ...
To use the string library, the only thing you need to do is include string.h at the top of your program. You can further explore the different functions in the string library within their own manual pages. The two most relevant to our discussion will be strcmp() and strlen(). However, I encourage you to explore some of the others; for example, strfry() will randomize the string to create an anagram (how useful!).
3.2 String Comparison
To solve our string comparison dilemma, we will use the strcmp() function from the string library. Here is the relevant man page:
STRCMP(3)                 Linux Programmer's Manual                STRCMP(3)

NAME
       strcmp, strncmp - compare two strings

SYNOPSIS
       #include <string.h>

       int strcmp(const char *s1, const char *s2);

       int strncmp(const char *s1, const char *s2, size_t n);

DESCRIPTION
       The strcmp() function compares the two strings s1 and s2. It returns
       an integer less than, equal to, or greater than zero if s1 is found,
       respectively, to be less than, to match, or be greater than s2.

       The strncmp() function is similar, except it compares the only first
       (at most) n bytes of s1 and s2.

RETURN VALUE
       The strcmp() and strncmp() functions return an integer less than,
       equal to, or greater than zero if s1 (or the first n bytes thereof) is
       found, respectively, to be less than, to match, or be greater than s2.
It comes in two varieties: one with a maximum length specified and one that relies on null termination. Both return the same kind of value. If the two strings are equal, the value is 0; if the first string is greater (later alphabetically), the value is positive; and if the first string is less (earlier alphabetically), the value is negative. Only the sign is guaranteed, so compare the result against 0 rather than against 1 or -1.
Plugging strcmp() into our secret message program, we get the desired results.
/*string_strcmp.c*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* needed for strcmp() */

int main(int argc,char *argv[]){
  char str[20];

  printf("Enter 'Navy' for a secret message:\n");
  scanf("%s",str);

  if( strcmp(str,"Navy") == 0 ) {
    printf("Go Navy! Beat Army!\n");
  }else{
    printf("No secret for you.\n");
  }
}
aviv@saddleback: demo $ ./string_strcmp
Enter 'Navy' for a secret message:
Navy
Go Navy! Beat Army!
3.3 String Length vs String Size
Another really important string library function is strlen(), which returns the length of the string. It is important to differentiate the length of the string from the size of the string.
- string length: how many characters, not including the null character, are in the string
- sizeof: how many bytes are required to store the string, including the null character
One of the most common mistakes when working with C strings is to use the sizeof the string where the length of the string is meant; the two are clearly different values. Here is a small program that demonstrates how this can go wrong quickly:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]){
  char str[]="Hello!";
  char * s = str;

  printf("strlen(str):%d sizeof(str):%d sizeof(s):%d\n",
         (int) strlen(str), //the length of the str
         (int) sizeof(str), //the memory size of the str
         (int) sizeof(s)    //the memory size of a pointer
         );
}
aviv@saddleback: demo $ ./string_length
strlen(str):6 sizeof(str):7 sizeof(s):8
Note that when using strlen() we get the length of the string "Hello!", which has 6 characters. The size of the string str is how much memory is used to store it, which is 7 once you include the null terminator. However, things get bad when you have a pointer to that string, s. Calling sizeof() on s returns how much memory is needed to store s, which is a pointer and thus 8 bytes in size on this machine. That has nothing to do with the length of the string or the size of the string. This is why, when working with strings, you should always make sure to use the length, not the size.
4 Pointer Arithmetic and Strings
As noted many times now, strings are arrays, and as such, you can work with them as arrays using indexing with [ ]; however, when programmers work with strings, they often use pointer arithmetic. For example, here is a routine to print a string to stdout:
void my_puts(char * str){

  while(*str){
    putchar(*str);
    str++;
  }

}
This function my_puts() takes a string and writes it, char-by-char, to stdout using the putchar() function. What might seem a little odd here is the use of the while loop, so let's unpack that:
while(*str)
What does this mean? First notice that str is declared as a char *, which is a pointer to a character. We also know that pointers and arrays are closely linked, so we can say that str is a string that references the first character in the string's array. Next, the *str operation is a dereference, which says to follow the pointer and retrieve the value that it references. In this case that would be a character value. Finally, the fact that this operation occurs in the condition of the loop means that we are testing whether the value the pointer references is not false, which is the same as asking if it is not zero, that is, not the NULL terminator.
So, the while(*str) says to continue looping as long as the character that str references is not the NULL terminator. The pointer value of str does change in the loop: it is incremented, str++, on each iteration after the call to putchar().
Now putting it all together, you can see that this routine will iterate through a string using a pointer until the NULL terminator is reached. Phew. While this might seem like a backwards way of doing this, it is actually a rather common and straightforward programming practice with strings and pointers in general.
4.1 Pointer Arithmetic and Types
Something that you might have noticed is that we have been using pointer arithmetic for different types in the same way. That is, consider the two arrays below, one an array of integers and one a string:
int a[] = {0,1,2,3,4,5,6,7};
char str[] = "Hello!";
The arrays have similar lengths (8 integers and 7 characters, counting the null terminator), but they are very different sizes. Integers are 4 bytes, so storing 8 integers requires 4*8=32 bytes. Characters are 1 byte in size, so storing 7 characters requires just 7 bytes. In memory the two arrays may look something like this:
       <------------------------ 32 bytes ---------------------------->
       .---------------.----------------.--- - - - ---.----------------.
a ->   |       0       |        1       |             |        7       |
       '---------------'----------------'--- - - - ---'----------------'

       .---.---.---.---.---.---.---.
str -> | H | e | l | l | o | ! | \0|
       '---'---'---'---'---'---'---'
       <--------- 7 bytes -------->
Now consider what happens when we use pointer arithmetic on these arrays to dereference the element at index 3:
/*pointer_math.c*/
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){
  int a[] = {0,1,2,3,4,5,6,7};
  char str[] = "Hello!";

  printf("a[3]:%d str[3]:%c\n", *(a+3),*(str+3));
}
aviv@saddleback: demo $ ./pointer_math
a[3]:3 str[3]:l
At first glance, the two additions do not seem consistent. When you add 3 to the array of integers a, you adjust the pointer by 12 bytes so that you now reference the value 3. However, when you add 3 to the string pointer, you adjust the pointer by 3 bytes to reference the value 'l'.
The reason is that pointer arithmetic takes the pointer's type into account. When you declare a pointer to reference a particular type, C is aware that adding to the pointer value should consider the size of the data being referenced. So when you add 1 to an integer pointer, you are moving the reference forward 4 bytes, since that is the size of an integer. If we were to print the pointer values (in hex) and do numerical arithmetic, we would see this to be true:
/* the differences are longs, so they are printed with %ld */
printf("a=%p a+3=%p (a+3-a)=%ld\n",a,a+3,
       ((long) (a+3)) - (long) a);
printf("str=%p str+3=%p (str+3-str)=%ld\n",str,str+3,
       ((long) (str+3)) - (long) str);
aviv@saddleback: demo $ ./pointer_math
a[3]:3 str[3]:l
a=0x7fffa5c4d260 a+3=0x7fffa5c4d26c (a+3-a)=12
str=0x7fffa5c4d280 str+3=0x7fffa5c4d283 (str+3-str)=3
In the first part, a+3 changed the pointer value by 0xc in hex, which is 12, while str+3 only changed the pointer value by 0x3, or 3 bytes. The last column shows the same thing more starkly: treating the pointer values as longs and subtracting them gives the byte distances 12 and 3.
4.2 Character Arrays as Arbitrary Data Buffers
Now you may be wondering, how do I access the individual bytes of larger data types? The answer to this is the final peculiarity of character arrays in C.
Consider that a char data type is 1 byte in size, which is the smallest data element we work with as programmers. Now consider that an array of char's occupies exactly as many bytes as it has elements. So when we write something like:
char s[4];
What we are really saying is: "allocate 4 bytes of data." We like to think about storing a string of length 3 in that character array with one byte for the null terminator, but we do not have to. In fact, any kind of data can be stored there as long as it is no more than 4 bytes in size. An integer is four bytes in size. Let's store an integer in s.
aviv@saddleback: demo $ cat pointer_casting.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){
  char s[4];

  s[0] = 255;
  s[1] = 255;
  s[2] = 255;
  s[3] = 255;

  int * i = (int *) s;

  printf("*i = %d\n", *i);
}
What this program does is set all the bytes in the character array to 255, which is the largest value an unsigned byte can store. The result is that we have 4 bytes of data that are all 1's, since 255 in binary is 11111111. Next, consider what happens with this cast:
int * i = (int *) s;
Now the pointer i references the same memory as s, which is 4 bytes of 1's. What's different is that i is an integer pointer, not a character pointer. That means the 4 bytes of 1's are an integer, not characters, from the perspective of i. And when we dereference i to print those bytes as a number, we get:
aviv@saddleback: demo $ ./pointer_casting
*i = -1
Which is the signed value for all 1's (remember two's complement?). What we've just done is use characters as a generic container for data and then used pointer casting to determine how to interpret that data. This may seem crazy (it is), but it is what makes C so low level and useful.
We often refer to character arrays as buffers because of this property of being arbitrary containers. A buffer of data is just a bunch of bytes, and a character array is the most direct way to access that data.