C compilation options with gcc:
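none or -std=gnu89 // GNU 89 (no option means gnu89 only in GCC before version 5; later versions default to a newer GNU dialect)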
-ansi or -std=c89 // ANSI, ISO C90
-std=c99 // ISO C99
-std=gnu99 // GNU 99

Chapter 1: A Tutorial Introduction

A C program consists of functions and variables.  A function contains statements that specify the computing operations to be done, and variables store values to be used during the computation.

Hello World example

#include <stdio.h>

int main(void) {
     printf("hello, world\n");
}

In C, a program begins executing at the beginning of the main function.  main will usually call other functions to help perform its job (some you wrote, others from libraries provided for you).

Chapter 2: Types, Operators and Expressions

Variables and constants are the basic data objects manipulated in a program.  Declarations list the variables to be used, and state their type (and perhaps initial value). Operators specify what is to be done to the variables. Expressions combine variables and constants to produce new values.  The type of an object determines the set of values it can have and what operations can be performed on it.

2.1 Variable Names

Made up of letters and digits; the first character must be a letter. The underscore _ counts as a letter (useful for improving the readability of long names), but don't begin variable names with it, since library routines often use such names. Upper and lower case are distinct: x is a different name from X.  Convention uses lower case for variable names and all upper case for symbolic constants.  Keywords like if, else, int etc. are reserved and can't be used as names.  Tend to use short names for local variables (especially loop indices) and longer names for external variables.

2.2 Data Types and Sizes

type description
char a single byte, capable of holding one character in the local character set
int an integer, typically natural size of integers on host machine
float single-precision floating point
double double-precision floating point
short int a short integer, e.g. short int sh; 'int' can be, and typically is, omitted
long int a long integer, e.g. long int counter; 'int' can be, and typically is, omitted
long double extended-precision floating point

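int is normally the natural size for a particular machine. short is often 16 bits, long 32 bits and int either 16 or 32 bits. The only guarantees are that short and int are at least 16 bits, long is at least 32 bits, and short is no longer than int, which is no longer than long.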

The qualifier signed or unsigned may be applied to char or any integer type.  Unsigned numbers are always positive or zero, and obey the laws of arithmetic modulo 2^n, where n is the number of bits in the type.

Arrays
type arrayName[arraySize];
Unlike structs, arrays are not first-class objects; they cannot be assigned or compared using single built-in operators.  The size of the array must be specified to set aside storage.

The most common type of array in C is the array of characters.

Strings
A sequence of characters in double quotes is called a character string or string constant. Strings are not a separate data type, but are conventionally implemented as null-terminated ('\0', called NUL in ASCII) arrays of characters:
h e l l o \n \0

Structures
struct declares a structure: a composite data type that groups a list of variables under one name in a single block of memory, so that the different variables can be accessed through a single pointer, or through the declared struct variable, which refers to the same address.

unions

pointers to types

functions that return types (or nothing if return void)

Enumerated types
The enum keyword introduces an enumerated type: a data type consisting of a set of named values called elements, members or enumerators of the type. They are not tagged, and are freely interconvertible with integers.

Character Input and Output

A text stream is a sequence of characters divided into lines: each line consists of zero or more characters followed by a newline character.

Symbolic Constants

Symbolic constants are a way to pull constant numbers out of a program to make it easier to understand.

A #define line defines a symbolic name or symbolic constant to be a particular string of characters:
#define name replacement text

The name has the same form as a variable name: a sequence of letters and digits that begins with a letter. The replacement text can be any sequence of characters; it is not limited to numbers.

#define is a directive to the C preprocessor which manipulates the source file before the C compiler accesses the files (see C pipeline).

Symbolic constant names are conventionally written in CAPS to distinguish from variable names.

2.3 Constants

type example comment
int constant 1234
long constant 123456789L suffix l (ell) or L
Unsigned constant suffix u or U
Unsigned long constant suffix ul or UL
Floating-point constants - type double 123.4 or exponent (1e-2) type is double unless suffixed
float constant 123.4 or exponent (1e-2) suffix f or F
long double suffix l or L
octal value of an integer constant (instead of decimal) 037 (= decimal 31) leading 0.   Can have suffix L for long and U for unsigned.
hexadecimal value of an integer constant (instead of decimal) 0x1f or 0X1F (= decimal 31) leading 0x or 0X means hexadecimal.  Can have suffix L for long and U for unsigned, e.g. 0XFUL is an unsigned long constant with value 15 decimal.
character constant 'x' an integer, written as one character in single quotes; its value is the numeric value of the character in the machine's character set, e.g. in the ASCII character set the character constant '0' has value 48.
arbitrary byte-sized bit pattern '\ooo' or '\xhh' ooo is one to three octal digits (0…7); hh is one or more hexadecimal digits (0…9, a…f, A…F).

Character Constants

A character written between single quotes represents an integer value equal to the numerical value of the character in the machine’s character set, called a character constant.

Escape Sequences

Complete set of escape sequences.

escape description
\a alert (bell) character
\b backspace
\f formfeed
\n newline
\r carriage return
\t horizontal tab
\v vertical tab
\\ backslash
\? question mark
\' single quote
\" double quote
\ooo octal number
\xhh hexadecimal number

A constant expression is an expression that involves only constants.  Such expressions may be evaluated during compilation rather than run-time.

A string constant or string literal is a sequence of zero or more characters surrounded by double quotes: "I am a string".  Technically, a string constant is an array of characters, internally with '\0' at the end (the storage requirement is therefore one more than the number of characters between the quotes).

An enumeration is a list of constant integer values.  The first name in an enum has value 0, the next 1 and so on, unless explicit values are specified.  Enumerations provide a convenient way to associate constant values with names.

enum boolean { NO, YES };

enum months { JAN = 1, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC }; // FEB is 2, MAR is 3, etc.

2.4 Declarations

All variables must be declared before use, usually at the beginning of the function before any executable statements.  A declaration specifies a type, and contains a list of one or more variables of that type:

int lower, upper, step; // or on individual lines, e.g for comments
char c, line[1000];

A variable may also be initialised in its declaration: if the name is followed by an equals sign and an expression, the expression serves as the initialiser.

char esc = '\\';

External and static variables are initialised to zero by default.

The qualifier const can be applied to the declaration of any variable to specify that its value will not be changed.

const char msg[] = "warning:";
int strlen(const char[]); // the function does not change the array

2.5 Arithmetic Operators

Binary arithmetic operators are +, -, *, / and modulo operator %.

The % operator cannot be applied to float or double.

Integer division truncates: any fractional part is discarded.

2.6 Relational and Logical Operators

Relational operators are >, >=, <, <=. The equality operators == and != come just below them in precedence.

&& and || are evaluated left to right, and evaluation stops as soon as the truth or falsehood result is known.

The numeric value of a relational or logical expression is 1 if true, or 0 if false.

The unary negation operator ! converts a non-zero operand into 0, and a zero operand into 1.

Logical AND operator: && if both the operands are non-zero, then the condition becomes true.
Logical OR operator: || if any of the two operands is non-zero, then the condition becomes true.
Logical NOT operator: ! used to reverse the logical state of its operand. If a condition is true, then Logical NOT operator will make it false.

2.7 Type Conversions

When an operator has operands of different types, they are converted to a common type according to a small number of rules:

  • if either operand is long double, convert the other to long double
  • otherwise, if either operand is double, convert the other to double
  • otherwise, if either operand is float, convert the other to float
  • otherwise, convert char and short to int
  • then, if either operand is long, convert the other to long

If an arithmetic operator has one floating point operand and one integer operand, the integer will be converted to floating point before the operation is done.

Explicit type conversions can be forced in any expression with a unary operator called a cast. In the construction (type-name) expression, the expression is converted to the named type.

2.8 Increment and Decrement Operators

++ and --
Prefix ++n increments n before its value is used
Suffix n++ increments n after its value has been used

2.9 Bitwise Operators

C provides six operators for bit manipulation; these may only be applied to integral operands, that is char, short, int and long whether signed or unsigned:

& bitwise AND
| bitwise inclusive OR
^ bitwise exclusive OR
<< left shift
>> right shift
~ one's complement (unary)

2.10 Assignment Operators and Expressions

+= is called an assignment operator.
i = i + 2 can be written i += 2, i.e. increment i by 2 (cf. take i, add 2, then put the result back in i).

2.11 Conditional Expressions

expr1 ? expr2 : expr3

expr1 is evaluated first.  If it is true (non-zero), then expr2 is evaluated, and that is the value of the conditional expression.  Otherwise, expr3 is evaluated and that is the value. Only one of expr2 and expr3 is evaluated.

2.12 Precedence and Order of Evaluation

Operators are ranked in decreasing order of precedence, e.g. *, / and % have the same precedence, which is higher than that of binary + and -.

Chapter 3: Control Flow

The control-flow statements of a language specify the order in which computations are performed.

3.1 Statements and Blocks

In C, the semicolon is a statement terminator.  Braces { and } are used to group declarations and statements together into a compound statement or block, so that they are syntactically equivalent to a single statement.  There is no semicolon after the right brace that ends a block.

3.2 If-Else

if-else is used to express decisions.

if (expression)
  statement1
else if (expression)
  statement
else
  statement2

The expression is evaluated; if it is true (has a non-zero value), statement1 is executed. If it is false (zero) and there is an else part, the else part is executed instead. The else part is optional.

In nested if-else statements, the else is associated with the closest previous else-less if.

3.3 Else-if

The expressions are evaluated in order; if any expression is true, the statement associated with it is executed and this terminates the whole chain.

3.4 Switch

The switch statement is a multi-way decision that tests whether an expression matches one of a number of constant integer values, and branches accordingly.

switch (expression) {
  case const-expr: statements
  case const-expr: statements
  default: statements
}

Default is optional and is executed if none of the other cases are satisfied.  Because cases serve just as labels, execution falls through to the next case unless you explicitly escape with break or return.

3.5 Loops - While and For

The while loop executes the statements in the body of the loop while the condition is true.  The body of a while loop can be one or more statements enclosed in braces, or a single statement without braces.

In the for loop, the first part, the initialisation, is done once, before the loop is entered.  The second part is the test or condition that controls the loop. If it is true, the body of the loop is executed.  Then the increment step is executed, and the condition is re-evaluated.  The loop terminates if the condition has become false.  The initialisation, condition and increment can be any expressions.

In C, a for loop must have a body, although the body may be the null statement (a semicolon on its own).

while (expression)
  statement

The expression is evaluated. If it is non-zero, statement is executed and expression is re-evaluated. This cycle continues until expression becomes zero, at which point execution resumes after statement.

for (expr1; expr2; expr3)
  statement

is equivalent to the following, except for the behaviour of continue (see 3.7):

expr1;
while (expr2) {
  statement
  expr3;
}

Any of the three parts of a for loop may be omitted, although the semicolons must remain. If expr2 is not present, it is taken as permanently true, giving an infinite loop unless it is broken by other means such as break or return.

3.6 Loops - Do-while

do
  statement
while (expression);

do-while tests at the bottom, after making each pass through the loop body: the body is always executed at least once. The statement is executed, then expression is evaluated.  If it is true, statement is executed again, and so on. When the expression becomes false, the loop terminates.

3.7 Break and Continue

The break statement provides an early exit from for, while and do loops, just as it does from switch. The continue statement causes the next iteration of the enclosing for, while or do loop to begin.

3.8 Goto and Labels

A label has the same form as a variable name, and is followed by a colon.

for ( … )
  for ( … ) {
    ...
    if (disaster)
      goto error;
  }
...
error:
  clean up mess

Generally code that relies on goto statement is harder to understand and maintain than code without.  Use rarely if at all.

Chapter 4: Functions and Program Structure

Functions break large computing tasks into smaller ones, and enable people to build on what others have done without starting from scratch.

4.1 Basics of Functions

A function provides a convenient way to encapsulate some computation, which can then be used without worrying about its implementation:

return-type function-name (parameter declaration, if any)
{
  declarations
  statements
}

In C, the called function cannot directly alter a variable in the calling function: it can only alter its private, temporary copy.

One method for communicating data between functions is for the calling function to provide a list of values (called arguments) to the function it calls. The parentheses after the function name surround the argument list.

Source files may be compiled separately and loaded together, along with previously compiled functions from libraries. Control over which functions and data objects are visible to other files is provided with static and extern attributes.

The return statement is the mechanism for returning a value from the called function to its caller.  Any expression can follow return.  The calling function is free to ignore the returned value.  With just the return keyword, no value is returned to the caller.

cc main.c getline.c strindex.c

Compiles the three files, placing the resulting object code in main.o, getline.o and strindex.o, then loads them all into an executable file called a.out.
.c for source files, .o for object files.

4.2 Functions Returning Non-integers

If a function returns something other than int, it must state the type of value it returns:

double atof(char s[])
{
...
}

The calling routine must know the return type too; one way is to declare the function explicitly in the calling routine:
double someVar, atof(char []);

4.3 External Variables

A C program consists of external objects, which are either variables or functions. External is used in contrast to internal which describes the arguments and variables defined inside functions.

Functions themselves are always external because C does not allow functions to be defined inside other functions.

Variables can be defined external to all functions, that is, as variables that can be accessed by name by any function.  Because external variables are globally accessible, they can be used instead of argument lists to communicate data between functions.

An external variable must be defined, exactly once, outside any function: this sets aside storage for it (definition = the place where the variable is created or assigned storage).  The variable must also be declared in each function that wants to access it; one way to do this is with an explicit extern declaration (declaration = a place where the nature of the variable is stated but no storage is allocated).

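Common practice is to place definitions of all external variables at the beginning of the source file, and then omit all extern declarations: if the definition appears in the source file before its use in a function, no extern declaration is needed in that function.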

4.4 Scope Rules

Variables are block scoped.

Variables declared within a function are private or local to that function (scoped) and no other function can have direct access to them.

A variable in a function comes into existence only when the function is called, and disappears when the function is exited.  Because variables come and go with function invocation, they do not retain their values from one call to the next, and must be explicitly set upon each entry.  If they are not set, they will contain garbage.

4.5 Header Files

Common external material (declarations and definitions shared between files) is often placed in a header file, which can then be included in the other files of the program.

4.6 Static Variables

The static declaration applied to an external variable or function limits the scope of that object to the rest of the source file being compiled.

4.7 Register Variables

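A register declaration advises the compiler that the variable in question will be heavily used, so that it may be placed in a machine register, which can result in smaller and faster programs. Compilers are free to ignore the advice.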

4.8 Block Structure

Variables can be defined in a block-structured fashion within a function.

4.9 Initialisation

In the absence of explicit initialisation, external and static variables are guaranteed to be initialised to zero; automatic and register variables have undefined (i.e. garbage) initial values.

4.10 Recursion

Functions may be used recursively.

4.11 The C Preprocessor

The C Preprocessor provides #include and #define. Conditional inclusion is allowed:

#if !defined(HDR)
#define HDR

/* contents of hdr.h */

#endif

Chapter 5: Pointers and Arrays

A pointer is a variable that contains the address of a variable. Pointers and arrays are closely related.

5.1 Pointers and Addresses

A typical machine has an array of consecutively numbered or addressed memory cells that may be manipulated individually or in contiguous groups.

A pointer is a group of cells (often two or four) that can hold an address.

The unary operator & gives the address of an object, so:
p = &c
assigns the address of c to the variable p, and p is said to "point" to c.

The unary operator * is the indirection or dereferencing operator. When applied to a pointer, it accesses the object the pointer points to.

int x = 1, y = 2, z[10];
int *ip; /* ip is a pointer to int */

ip = &x; /* ip now points to x */
y = *ip; /* y is now 1 */
*ip = 0; /* x is now 0 */
ip = &z[0]; /* ip now points to z[0] */

Every pointer points to a specific data type (one exception is a "pointer to void" which is used to hold any type of pointer but cannot be dereferenced itself).

5.2 Pointers and Function Arguments

Since C passes arguments to functions by value, there is no direct way for the called function to alter a variable in the calling function.

A function called with plain values cannot change them in the caller, because it only changes its own copies. To change values in the caller, pass pointers to the values to be changed:
swap(&a, &b)

5.3 Pointers and Arrays

Any operation that can be achieved by array subscripting can also be done with pointers. K&R state that the pointer version will in general be faster; with modern optimising compilers the two forms often compile to the same code.

a[i] refers to the i-th element of the array.

A pointer is a variable, but an array name is not: for a pointer pa and an array a, pa = a and pa++ are legal, whereas a = pa and a++ are not.

5.4 Address Arithmetic

5.5 Character Pointers and Functions

5.6 Pointer Arrays: Pointers to Pointers

Since pointers are variables themselves, they can be stored in arrays just as other variables can.

5.7 Multi-dimensional Arrays

5.8 Initialisation of Pointer Arrays

5.9 Pointers vs. Multi-dimensional Arrays

5.10 Command-line Arguments

5.11 Pointers to Functions

5.12 Complicated Declarations

Chapter 6: Structures

A structure is a collection of one or more variables, possibly of different types, grouped together under a single name for convenient handling.

6.1 Basics of Structures

struct point {
  int x;
  int y;
};

The keyword struct introduces a structure declaration, including an optional name called the structure tag (point above).

The variables named in a structure are called members. A struct declaration defines a type. The right brace that terminates the list of members may be followed by a list of variables, just as for any basic type.

A member of a structure is referred to in an expression by a construction of the form structure-name.member.

6.2 Structures and Functions

The only legal operations on a structure are copying or assigning it as a unit, taking its address with &, and accessing its members.

6.3 Arrays of Structures

6.4 Pointers to Structures

6.5 Self-referential Structures

6.6 Table Lookup

6.7 Typedef

C provides a facility called typedef for creating new data type names.

For example, typedef int Length; makes the name Length a synonym for int.

6.8 Unions

A union is a variable that may hold objects of different types and sizes, with the compiler keeping track of size and alignment requirements. Unions provide a way to manipulate different kinds of data in a single area of storage.

6.9 Bit-fields

Chapter 7: Input and Output

Input and output facilities are not part of the C language itself; they are provided by functions of the standard library.

Chapter 8: The UNIX System Interface

The UNIX operating system provides its services through a set of system calls, which are in effect functions within the operating system that may be called by user programs.


Standard IO header: #include <stdio.h>

Limits header: #include <limits.h>

Float header: #include <float.h>

Math header: #include <math.h>

Type header: #include <ctype.h>

EOF: int defined in <stdio.h>

getchar: reads the next input character from a text stream and returns that as its value.
c = getchar() // c contains the next character of input

pow(x,y): computes x to the power y

printf: the first argument is a string of characters to be printed, with each % indicating where one of the other (second, third, ...) arguments is to be substituted, and in what form it is to be printed. %d specifies an integer argument. Each % construction in the first argument is paired with the corresponding second argument, third argument, and so on; they must match up in number and type.

%d print as decimal integer
%6d print as decimal integer, at least 6 characters wide
%f print as floating point
%6f print as floating point, at least 6 characters wide
%.2f print as floating point, 2 characters after decimal point
%6.2f print as floating point, at least 6 wide and 2 after decimal point
%o print as octal
%x print as hexadecimal
%c print as character
%s print as string
%% print %
%ld print long integer

putchar(c): prints a character each time it is called

scanf: like printf, but reads input instead of writing output.

strlen(s): returns the length of its character string argument (s), excluding the terminating '\0'.