Advanced Programming in the UNIX Environment, 3e#

[ h ] Advanced Programming in the UNIX Environment. 3e.


Table of Contents#


1 - UNIX System Overview#

Unix Architecture#


  • Kernel

  • System Calls

  • Library Routines - libraries of common functions are built on top of the system call interface

  • Shell - a special application that provides an interface for running other applications

  • Applications - can use both library routines that access system calls and system calls themselves

Login#

  1. The user enters their login name and password.

  2. The system looks up the user’s login name in its password file /etc/passwd.

  3. An entry in the password file consists of 7 colon-separated fields.

    • login name

    • encrypted password (modern systems have moved the encrypted password to a different file)

    • numeric user ID

    • numeric group ID

    • comment

    • home directory

    • shell program

  4. The working directory is set to the user’s home directory (field 6)

  5. The user’s shell is executed (field 7)
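The seven fields of a password file entry can be pulled apart with a few string calls. The following is a minimal sketch, not how login itself is implemented: the helper `split_passwd_line` and the constant `PASSWD_FIELDS` are hypothetical names chosen for illustration. Real programs would normally call getpwnam(3) rather than parse the file by hand.

```c
#include <string.h>

#define PASSWD_FIELDS 7 /* login name ... shell program */

/* Split one passwd-style line in place into its colon-separated fields.
 * Each colon is overwritten with a null byte and fields[] points into `line`.
 * Returns the number of fields found (at most PASSWD_FIELDS).
 */
int split_passwd_line (char *line, char *fields[PASSWD_FIELDS]) {
  int   n = 0;
  char *p = line;

  line[strcspn(line, "\n")] = '\0'; /* drop a trailing newline, if any */

  while (n < PASSWD_FIELDS) {
    fields[n++] = p;
    p = strchr(p, ':');
    if (p == NULL)
      break;     /* no more separators: that was the last field */
    *p++ = '\0'; /* terminate this field, step to the next one */
  }
  return n;
}
```

For an illustrative entry such as sar:x:205:105:Stephen Rago:/home/sar:/bin/ksh the call returns 7, with fields[5] pointing at the home directory and fields[6] at the shell program.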

Process ID (PID)

User ID (UID)

  • found in the password file /etc/passwd

  • a unique numeric value that identifies the user to the system

  • assigned by the sysadmin when the user’s login name is assigned

  • immutable?

  • the kernel uses the UID to check whether the user has the appropriate permissions to perform certain ops

  • UID 0 = root or superuser; if a process has superuser privileges then most file permission checks are bypassed; some os functions are restricted to the superuser

Group ID (GID)

  • found in the password file /etc/passwd

  • a numeric value that collects users together into projects or departments for the purposes of sharing resources among group members

  • assigned by the sysadmin when the user’s login name is assigned

  • the group file /etc/group maps group names to GIDs

the file system stores both the UID and the GID of the owner of every file on disk

  • 4 B = 2 B int (UID) + 2 B int (GID)

  • more disk space would be required if the full ASCII login name and group name were used (strings)

  • comparing strings during permission checks would be more expensive than comparing integers

  • early UNIX systems used 16-bit ints to represent UIDs and GIDs; contemporary systems use 32-bit ints

supplementary GIDs

  • started in 4.2BSD

  • users can belong to up to 16 additional groups

  • supplementary GIDs are obtained at login time by reading the file /etc/group and finding the first 16 entries that list the user as a member

  • POSIX requires that a system support at least 8 supplementary groups per process, but most systems support at least 16

Shells#

# typical locations; the exact path varies by system
/bin/sh   # Bourne
/bin/bash # Bourne-Again
/bin/dash # Debian Almquist
/bin/ksh  # Korn
/bin/csh  # C
/bin/tcsh # TENEX C
/bin/zsh  # Z
/bin/fish # Fish

Files#

Directory Entries

  • logical view - each directory entry contains a filename along with information describing the file’s attributes

  • physical view - the way it is actually stored on disk

  • most implementations of the UNIX file system don’t store attributes in the directory entries themselves because of the difficulty of keeping them in sync when a file has multiple hard links

File Names

  • the only two characters that cannot appear in a file name are the slash / and the null character

    • the slash / separates the file names that form a path name

    • the null character terminates a path name

    • it’s good practice to restrict the characters in a file name to a subset of the normal printing characters: if we use the shell’s special characters in a file name then we have to use the shell’s quoting mechanism to reference the file name; POSIX.1 recommends restricting file names to consist of the following characters: letters a-zA-Z, numbers 0-9, period ., dash -, and underscore _

  • two file names are automatically created whenever a new directory is created

    • . dot refers to the current directory

    • .. dot-dot refers to the parent directory

    • in the root directory /, dot-dot .. is the same as dot .

  • file name size

    • 14 characters: Research UNIX System & UNIX System V systems

    • 255 characters: BSD

    • 255+ today
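The character set that POSIX.1 recommends for file names (letters, digits, period, dash, underscore) is easy to check mechanically. A minimal sketch follows; the function name `is_portable_filename` is hypothetical, and it checks only the character set described in these notes, not length limits.

```c
#include <string.h>

/* Return 1 if `name` is non-empty and consists only of the characters
 * a-z, A-Z, 0-9, period, dash, and underscore; otherwise return 0.
 */
int is_portable_filename (const char *name) {
  static const char allowed[] =
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789._-";

  if (name[0] == '\0')
    return 0;

  /* strspn returns the length of the leading run of allowed characters;
   * the name is acceptable only if that run covers the whole string
   */
  return strspn(name, allowed) == strlen(name);
}
```

A name such as notes_v1.txt passes, while a name containing a space or a slash is rejected, so it never needs shell quoting.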

File Attributes

  • file type (regular file, directory)

  • file size

  • file owner

  • file permissions

  • last modification time

  • last access time
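The attributes listed above can be read with function stat, which fills in a structure `stat`. A small sketch that inspects only the file type follows; the helper name `is_directory` is hypothetical. The same structure also carries the size (`st_size`), owner (`st_uid`, `st_gid`), and modification and access times (`st_mtime`, `st_atime`).

```c
#include <sys/stat.h>

/* Return 1 if `path` names a directory, 0 if it names some other
 * type of file, and -1 if stat fails (errno says why).
 */
int is_directory (const char *path) {
  struct stat sb;

  if (stat(path, &sb) < 0)
    return -1;

  return S_ISDIR(sb.st_mode) ? 1 : 0; /* the file type lives in st_mode */
}
```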

I/O#

Unbuffered I/O

  • functions open, read, write, lseek, and close all work with file descriptors

  • read reads a specified number of bytes

Standard I/O uses header file <stdio.h>

  • standard I/O functions provide a buffered interface to the unbuffered I/O functions

  • functions fgets, printf

  • conveniences

    • don’t have to choose an optimal buffer size via BUFFSIZE

    • simplifies dealing with lines of input

    • control the style of buffering

  • fgets reads an entire line

  • printf

Programs & Processes#

UNIX guarantees that every process has a unique numeric identifier called the process ID (PID) which is always a non-negative integer.

Process Control functions

  • exec

  • fork

  • waitpid
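A small sketch of fork and waitpid working together, without exec: the child terminates immediately with a chosen exit code and the parent collects it. The function name `child_exit_status` is hypothetical, and the sketch assumes the child is not killed by a signal.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code` and return the exit status
 * that the parent observes, or -1 on error.
 */
int child_exit_status (int code) {
  int   status;
  pid_t pid = fork();

  if (pid < 0)
    return -1;    /* fork failed */

  if (pid == 0)
    _exit(code);  /* child: terminate at once, without flushing stdio buffers */

  /* parent: wait for this particular child */
  if (waitpid(pid, &status, 0) < 0)
    return -1;

  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling child_exit_status(127) in the parent yields 127, the same value a shell would report for that child.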

Threads#

Usually a process has only one thread of control–one set of machine instructions executing at a time.

All threads within a process share the same address space, file descriptors, stacks, and process-related attributes.

Each thread executes on its own stack, although any thread can access the stacks of other threads in the same process.

Since they access the same memory, threads need to synchronize access to shared data among themselves to avoid inconsistencies.

Threads are identified by thread IDs which are local to a process: a thread ID in one process has no meaning in another process.
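To illustrate why shared data needs synchronization, here is a minimal POSIX threads sketch in which several threads increment one counter under a mutex. The names `struct counter`, `add_to_counter`, `count_with_threads`, and `MAX_THREADS` are hypothetical. The counter lives on the stack of the calling thread, which the other threads can reach because they share its address space. Some systems need the compiler flag -pthread to build it.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16

struct counter {
  pthread_mutex_t lock;       /* protects `value` */
  long            value;
  int             iterations; /* increments performed by each thread */
};

static void *add_to_counter (void *arg) {
  struct counter *c = arg;

  for (int i = 0; i < c->iterations; i++) {
    pthread_mutex_lock(&c->lock);
    c->value++;               /* the shared data is touched only under the lock */
    pthread_mutex_unlock(&c->lock);
  }
  return NULL;
}

/* Start `nthreads` threads that each add `iterations` to one counter.
 * Returns the final count, or -1 if a thread could not be started.
 */
long count_with_threads (int nthreads, int iterations) {
  struct counter c = { .value = 0, .iterations = iterations };
  pthread_t      tids[MAX_THREADS];
  int            started = 0;

  if (nthreads < 1 || nthreads > MAX_THREADS)
    return -1;
  if (pthread_mutex_init(&c.lock, NULL) != 0)
    return -1;

  while (started < nthreads &&
         pthread_create(&tids[started], NULL, add_to_counter, &c) == 0)
    started++;

  for (int i = 0; i < started; i++)
    pthread_join(tids[i], NULL); /* wait for every thread that was started */

  pthread_mutex_destroy(&c.lock);
  return started == nthreads ? c.value : -1;
}
```

With the lock in place, count_with_threads(4, 10000) returns 40000; without it, increments from different threads could be lost.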

Error Handling#

Functions that return integers often return a negative value to indicate an error, and the variable errno is usually set to an integer value that indicates the nature of the error; for example, function open returns either a non-negative file descriptor or -1 if an error occurs. An error from open has about 15 possible errno values, such as

  • ENOENT file does not exist

  • EACCES permission error, insufficient permission to open the requested file

Some functions rely on a convention other than returning a negative number; for example, most functions that return a pointer to an object return a null pointer to indicate an error.

Header file <errno.h> defines errno along with constants beginning with E for each value that it can assume. The first page of Section 2 of the UNIX system manuals, named intro(2), lists the error constants. On Linux, the error constants are listed in the errno(3) manual page.

Errors are divided into two categories: fatal and nonfatal. Fatal errors have no recovery action; the best that can be done is to print an error message to the user’s screen or to a log file and then exit. Most nonfatal errors are temporary (e.g., a resource shortage) and might not occur when there is less activity on the system. The typical recovery action for a resource-related nonfatal error is to delay and retry later.

resource-related nonfatal errors

  • EAGAIN

  • ENFILE

  • ENOBUFS

  • ENOLCK

  • ENOSPC

  • EWOULDBLOCK

  • ENOMEM

  • EBUSY can be treated as nonfatal when it indicates that a shared resource is in use

  • EINTR can be treated as nonfatal when it interrupts a slow system call
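The list above can be turned into a small classifier that a retry loop might consult. This is a sketch under assumptions: the function name `is_resource_shortage` is hypothetical, and EBUSY and EINTR are deliberately left out because, as noted, they are nonfatal only in particular situations that the caller has to judge.

```c
#include <errno.h>

/* Return 1 if `err` is one of the resource-related errno values
 * for which "delay and retry later" is a reasonable reaction.
 * An if-chain is used rather than a switch because EAGAIN and
 * EWOULDBLOCK have the same value on some systems.
 */
int is_resource_shortage (int err) {
  return err == EAGAIN      ||
         err == ENFILE      ||
         err == ENOBUFS     ||
         err == ENOLCK      ||
         err == ENOSPC      ||
         err == EWOULDBLOCK ||
         err == ENOMEM;
}
```

A caller would test the saved errno value after a failed call and, if the function returns 1, sleep briefly before trying the call again.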

errno

  • POSIX and ISO C define errno as a symbol expanding into a modifiable lvalue of type integer (either an integer that contains the error number or a function that returns a pointer to the error number)

  • historical definition: extern int errno;

  • Linux multithreaded access to errno: extern int *__errno_location(void); #define errno (*__errno_location())

two rules

  • the value of errno is never cleared by a routine if an error does not occur; therefore, we should examine its value only when the return value from a function indicates that an error occurred

  • the value of errno is never set to 0 by any of the functions, and none of the constants defined in <errno.h> has a value of 0
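Both rules suggest the same habit: look at errno only after a return value has signalled failure, and copy it before calling anything else. A minimal sketch follows; the function name `try_open` is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to open `path` for reading.
 * Returns 0 on success, or the errno value reported by open on failure.
 */
int try_open (const char *path) {
  int fd = open(path, O_RDONLY);

  if (fd < 0) {
    int saved = errno; /* examine errno only because open returned -1,
                        * and save it before any other call can change it */
    return saved;
  }

  close(fd);           /* success: errno is not consulted at all */
  return 0;
}
```

Because no error constant has the value 0, a return value of 0 can safely mean "no error" here.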

C standard functions for printing error messages

/* this function maps the `errno` value `errnum`
 * into an error message string
 * and returns a pointer to the string
 */
#include <string.h>
char *strerror(int errnum);
/* this function produces an error message on STDERR
 * based on the current value of `errno` and returns;
 * it outputs the string pointed to by `msg` followed by
 * a colon and a space, followed by the error message
 * corresponding to the value of `errno`, followed by a newline
 */
#include <stdio.h>
void perror(const char *msg);

Signals#

Signals are a technique to notify a process that some condition has occurred. For example, if a process divides by zero, the signal whose name is SIGFPE (floating-point exception) is sent to the process.

A process has three choices for dealing with a signal

  1. ignore the signal - this isn’t recommended for signals that denote hardware exceptions (e.g., dividing by zero, referencing memory outside the address space of the process) since the results are undefined

  2. let the default action occur (e.g., for a divide-by-zero condition the default action is to terminate the process)

  3. provide a function that is called when the signal occurs (“catching the signal”) and handle it

keyboard

  • Interrupt Ctrl-C

  • Quit Ctrl-\

functions/commands

  • function kill can be called from a process to send a signal to another process (we have to be the owner of the process or the superuser to be able to send this signal)

signals

  • SIGINT

  • SIGFPE
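The third choice, catching a signal, can be sketched in a few lines. The names `on_sigusr1` and `catch_sigusr1` are hypothetical, and the signal SIGUSR1 is used only because it is reserved for applications. The handler does nothing except set a flag, which keeps it safe to run at any point in the program.

```c
#include <signal.h>

static volatile sig_atomic_t got_signal = 0;

/* signal-catching function: record that the signal arrived */
static void on_sigusr1 (int signo) {
  (void)signo;
  got_signal = 1;
}

/* Install the handler, send SIGUSR1 to this process, and report
 * whether the handler ran: 1 if it did, 0 if not, -1 on error.
 */
int catch_sigusr1 (void) {
  got_signal = 0;

  if (signal(SIGUSR1, on_sigusr1) == SIG_ERR)
    return -1;

  if (raise(SIGUSR1) != 0) /* raise returns only after the handler has returned */
    return -1;

  return got_signal;
}
```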

Time#

Calendar Time

  • counts the number of seconds since Epoch: 00:00:00 Jan 1, 1970, UTC

  • these time values are used to record the time a file was last modified, for example

  • the primitive system data type time_t holds these time values

Process Time (CPU Time)

  • measures the central processor resources used by a process

  • process time is measured in clock ticks, which have historically been 50, 60, or 100 ticks per second

  • the primitive system data type clock_t holds these values

When we measure the execution time of a process, UNIX maintains three values for the process

  • (wall) clock time - the amount of time the process takes to run; its value depends on the number of other processes running on the system

  • user CPU time - the CPU time attributed to the user instructions

  • system CPU time - the CPU time attributed to the kernel when it executes on behalf of the process (e.g., whenever a process executes a system service such as read or write the time spent within the kernel performing that system service is charged to the process)

  • the sum of the user CPU time and the system CPU time is often called the CPU time
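A process can ask for its own user and system CPU time with function times, which reports them in clock ticks; dividing by the tick rate from sysconf gives seconds. The helper names `ticks_to_seconds` and `cpu_seconds` below are hypothetical, and this is a sketch rather than a replacement for time(1).

```c
#include <sys/times.h>
#include <unistd.h>

/* Convert a tick count to seconds, given the number of ticks per second. */
double ticks_to_seconds (clock_t ticks, long ticks_per_second) {
  return (double)ticks / (double)ticks_per_second;
}

/* Store the user and system CPU time consumed so far by the calling
 * process, in seconds. Returns 0 on success, -1 on error.
 */
int cpu_seconds (double *user, double *sys) {
  struct tms t;
  long       hz = sysconf(_SC_CLK_TCK); /* clock ticks per second */

  if (hz <= 0 || times(&t) == (clock_t)-1)
    return -1;

  *user = ticks_to_seconds(t.tms_utime, hz);
  *sys  = ticks_to_seconds(t.tms_stime, hz);
  return 0;
}
```

At 100 ticks per second, for example, 250 ticks correspond to 2.5 seconds.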

time(1)

System Calls#

“All operating systems provide service points through which programs request services from the kernel; all implementations of [UNIX] provide a well-defined, limited number of entry points directly into the kernel called system calls.” - Advanced Programming in the UNIX Environment, 3e

  • Version 7 Research UNIX System : about 50 system calls

  • 4.4BSD : about 110 system calls

  • SVR4 : about 120 system calls

  • Linux 3.2.0 : 380 system calls

  • FreeBSD 8.0 : 450+ system calls

The system call interface is documented in Section 2 of the UNIX Programmer’s Manual. Its definition is in the C language, no matter which implementation technique is actually used to invoke a system call. Older systems traditionally defined kernel entry points in the assembly language of the machine.

System Calls vs Library Functions#

memory allocation and garbage collection

  • malloc(3) function - implements one particular type of allocation; if we don’t like its operation then we can define our own function malloc which will probably use the system call sbrk

  • sbrk(2) system call - not a general-purpose memory manager; it increases or decreases the address space of the process by a specified number of bytes (how that space is managed is up to the process)

  • garbage collection techniques: best fit, first fit, etc.

datetime

  • some operating systems provide one system call for time and another system call for date, and any special handling such as the switch to or from daylight saving time is handled by the kernel or requires human intervention

  • UNIX provides a single system call that returns the number of seconds since the Epoch: 00:00:00 Jan 1, 1970, UTC; any interpretation of this value such as converting it to a human-readable datetime using the local time zone is left to the user process (the standard C library provides algorithms for daylight saving time)
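The division of labour described above can be seen in a few lines: the kernel supplies only a count of seconds, and library functions turn it into a readable date. A minimal sketch using the standard C functions gmtime and strftime follows; the helper name `format_utc` and the output format are choices made for this example. It formats in UTC, so no time zone or daylight saving rules are involved.

```c
#include <time.h>

/* Format the calendar time `t` as "YYYY-MM-DD HH:MM:SS" in UTC.
 * Returns 0 on success, -1 if the conversion fails or `buf` is too small.
 */
int format_utc (time_t t, char *buf, size_t len) {
  struct tm *tm = gmtime(&t); /* seconds since the Epoch -> broken-down UTC time */

  if (tm == NULL)
    return -1;

  /* strftime returns 0 when the result does not fit in `buf` */
  return strftime(buf, len, "%Y-%m-%d %H:%M:%S", tm) > 0 ? 0 : -1;
}
```

Passing the value 0 produces the Epoch itself, 1970-01-01 00:00:00; passing the result of time(NULL) produces the current UTC date and time.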

process control

  • system calls: exec, fork, waitpid

  • functions: system, popen

Code#

/* myls.c
 *
 * print the name of every file in a directory
 * 
 * take an argument from the command line `argv[1]`
 * as the name of the directory to list
 */
#include "apue.h"
#include <dirent.h> /* function prototypes for `opendir`, `readdir`; definition of structure `dirent`
                     *
                     * we use `opendir`, `readdir`, `closedir` to manipulate the directory
                     * since the format of directory entries varies from one UNIX system to another
                     */

int main (int argc, char *argv[]) {
  DIR           *dp;
  struct dirent *dirp;

  if (argc != 2)
    err_quit("usage: ls directory_name");

  if ((dp = opendir(argv[1])) == NULL) /* function `opendir` returns a pointer to a structure `DIR`
                                        * which is passed to function `readdir`
                                        */
    err_sys("can't open %s", argv[1]);

  while ((dirp = readdir(dp)) != NULL) /* function `readdir` returns a pointer to a structure `dirent`
                                        * or a null pointer when it's finished with the directory
                                        */
    printf("%s\n", dirp->d_name);

  closedir(dp);
  exit(0);
}
/* stdin2stdout.c
 *
 * Copy STDIN to STDOUT.
 */

#include "apue.h"
#define BUFFSIZE 4096

int main (void) {
  int  n;
  char buf[BUFFSIZE];

  /* the constants `STDIN_FILENO` and `STDOUT_FILENO`
   * are part of the POSIX standard and
   * are defined in the header file `<unistd.h>`
   * they specify the file descriptors for STDIN (0) and STDOUT (1)
   */
  while ((n = read(STDIN_FILENO, buf, BUFFSIZE)) > 0) /* function `read` returns the number of bytes that are read
                                                       * and this value is used as the number of bytes to write;
                                                       * when the end of the input file is encountered,
                                                       * function `read` returns 0 and the program stops;
                                                       * if a read error occurs then function `read` returns -1
                                                       */
    if (write(STDOUT_FILENO, buf, n) != n)
      err_sys("write error");

  if (n < 0)
    err_sys("read error");

  exit(0);
}
cc stdin2stdout.c
./a.out > data             # STDIN is the terminal; STDOUT is redirected to the file `data`; STDERR is the terminal
./a.out < infile > outfile # the file named `infile` is copied to the file named `outfile`
/* Advanced Programming in the UNIX Environment, 3e
 *
 * stdin2stdout_buffered.c
 *
 * Copy STDIN to STDOUT using standard I/O.
 *
 * the header file `<stdio.h>` defines the following constants
 *   stdin
 *   stdout
 *   EOF
 *
 * function `getc` reads one character at a time
 * which is written by function `putc`;
 * after the last byte of input is read,
 * `getc` returns the constant `EOF`
 */

#include "apue.h"

int main (void) {
  int c;

  while ((c = getc(stdin)) != EOF)
    if (putc(c, stdout) == EOF)
      err_sys("output error");

  if (ferror(stdin))
    err_sys("input error");

  exit(0);
}
/* Advanced Programming in the UNIX Environment, 3e
 *
 * pid.c
 *
 * Print the PID of this program and exit.
 *
 * Details
 *   function `getpid` returns data type `pid_t`
 *
 *   we don't know its size: all we know is that
 *   the standard guarantees that it will fit in a long int
 *
 *   the value must be cast to the largest data type
 *   that it might use (in this case, a long int)
 *
 *   although most PIDs fit in an int,
 *   a long int promotes portability
 */

#include "apue.h"

int main (void) {
  printf("hello world from PID %ld\n", (long)getpid());
  exit(0);
}
cc pid.c
./a.out
hello world from PID 851
/* Advanced Programming in the UNIX Environment, 3e
 *
 * process_control.c
 * a demonstration of UNIX's process control functions
 *
 * Read commands from STDIN and execute them.
 * (This is a bare-bones implementation of a shell-like program.)
 *
 * Details
 *   standard I/O function `fgets` is used to read one line at a time from STDIN
 *
 *   when the end-of-file character (Ctrl-D) is typed as the first character of a line
 *   `fgets` returns a null pointer, the loop stops, and the process terminates
 *
 *   each line returned by `fgets` is terminated with a newline character followed by a null byte
 *   the standard C function `strlen` is used to calculate the length of the string
 *   and then replace the newline with a null byte
 *   because function `execlp` wants a null-terminated argument, not a newline-terminated argument
 *
 *   function `fork` is called to create a new process (called the child) which is a copy of the caller (called the parent)
 *   and returns the non-negative PID of the child process to the parent process
 *   and returns 0 to the child process
 *
 *   since `fork` creates a new process, we say that
 *   it is called once--by the parent--but returns twice--
 *   in the parent and in the child
 *
 *   `execlp` is called in the child to execute the command that was read from STDIN
 *   this replaces the child process with the new program file
 *
 *   on some operating systems, the combination of `fork` followed by `exec` is called
 *   "spawning a new process"
 *
 *   since the child calls `execlp` to execute the new program file
 *   the parent waits for the child to terminate
 *   by calling function `waitpid` and specifying which process to wait for
 *   via argument `pid`, the PID of the child
 *
 *   `waitpid` also returns the status of the child in variable `status`
 *   (we don't use it here)
 *
 *   to allow arguments would require that we parse the input line
 *   separating the arguments by some convention (e.g., by space or tabs)
 *   and then pass each argument as a separate parameter to function `execlp`
 */

#include "apue.h"
#include <sys/wait.h>

int main (void) {
  char  buf[MAXLINE]; /* from apue.h */
  pid_t pid;
  int   status;

  printf("%% "); /* print prompt (printf requires %% to print %) */
  while (fgets(buf, MAXLINE, stdin) != NULL) {
    if (buf[strlen(buf) - 1] == '\n')
        buf[strlen(buf) - 1] = 0; /* replace newline with null */

    if ((pid = fork()) < 0) {
      err_sys("fork error");
    }
    else if (pid == 0) { /* child */
      execlp(buf, buf, (char *)0);
      err_ret("couldn't execute: %s", buf);
      exit(127);
    }

    /* parent */
    if ((pid = waitpid(pid, &status, 0)) < 0)
      err_sys("waitpid error");
    printf("%% ");
  }
  exit(0);
}
/* Advanced Programming in the UNIX Environment, 3e
 *
 * errmsg.c
 * a demonstration of the functions `strerror` and `perror`
 *
 * Details
 *   the name of the program `argv[0]` is passed as the argument to function `perror`
 *   this is a standard convention in UNIX
 *   by doing this, if the program is executed as part of a pipeline
 *   we are able to tell which program in the pipeline generated a particular error message
 */

#include "apue.h"
#include <errno.h>

int main (int argc, char *argv[]) {
  fprintf(stderr, "EACCES: %s\n", strerror(EACCES));
  errno = ENOENT;
  perror(argv[0]);
  exit(0);
}
cc errmsg.c
./a.out
EACCES: Permission denied
./a.out: No such file or directory
/* Advanced Programming in the UNIX Environment, 3e
 *
 * uid_gid.c
 * 
 * This program prints the UID and the GID.
 */

#include "apue.h"

int main (void) {
  printf("uid = %ld, gid = %ld\n", (long)getuid(), (long)getgid());
  exit(0);
}
cc uid_gid.c
./a.out
uid = 501, gid = 20
/* Advanced Programming in the UNIX Environment, 3e
 *
 * process_control_with_signal.c
 */

#include "apue.h"
#include <sys/wait.h>

static void sig_int(int); /* our signal-catching function */

int main (void) {
  char  buf[MAXLINE]; /* from apue.h */
  pid_t pid;
  int   status;

  if (signal(SIGINT, sig_int) == SIG_ERR) /* catch the signal `SIGINT` and call function `sig_int` */
    err_sys("signal error");

  printf("%% "); /* print prompt (printf requires %% to print %) */
  while (fgets(buf, MAXLINE, stdin) != NULL) {
    if (buf[strlen(buf) - 1] == '\n')
        buf[strlen(buf) - 1] = 0; /* replace newline with null */

    if ((pid = fork()) < 0) {
      err_sys("fork error");
    }
    else if (pid == 0) { /* child */
      execlp(buf, buf, (char *)0);
      err_ret("couldn't execute: %s", buf);
      exit(127);
    }

    /* parent */
    if ((pid = waitpid(pid, &status, 0)) < 0)
      err_sys("waitpid error");
    printf("%% ");
  }
  exit(0);
}

void sig_int (int signo) {
  printf("interrupt\n%% ");
}

Exercises#

1

Verify on your system that the directories dot and dot-dot are not the same, except in the root directory.

2

In the output from the following program, what happened to the processes with PIDs 852 and 853?

/* Advanced Programming in the UNIX Environment, 3e
 *
 * pid.c
 *
 * Print the PID of this program and exit.
 *
 * Details
 *   function `getpid` returns data type `pid_t`
 *
 *   we don't know its size: all we know is that
 *   the standard guarantees that it will fit in a long int
 *
 *   the value must be cast to the largest data type
 *   that it might use (in this case, a long int)
 *
 *   although most PIDs fit in an int,
 *   a long int promotes portability
 */

#include "apue.h"

int main (void) {
  printf("hello world from PID %ld\n", (long)getpid());
  exit(0);
}
cc pid.c
./a.out
hello world from PID 851
./a.out
hello world from PID 854

3

In the prototypes of strerror and perror shown under Error Handling, the argument to perror is defined with the ISO C attribute const, whereas the integer argument to strerror isn’t defined with this attribute. Why?

4

If the calendar time is stored as a signed 32-bit integer, in which year will it overflow? How can we extend the overflow point? Are these strategies compatible with existing applications?

5

If the process time is stored as a signed 32-bit integer, and if the system counts 100 ticks per second, after how many days will the value overflow?


2 - UNIX Standardization and Implementations#


3 - File I/O#

functions

  • open

  • read

  • write

  • lseek

  • close

The effect of buffer size on functions read and write.

Unbuffered I/O means that each read or write invokes a system call. Unbuffered I/O functions are not part of ISO C but are part of POSIX.1 and the Single UNIX Spec.

The notion of an atomic op is important in the context of resource sharing among multiple processes.

  • arguments to function open

  • How are files shared among multiple processes? Which kernel data structures are involved?

functions

  • fcntl

  • sync

  • fsync

  • ioctl

FILE DESCRIPTOR

The kernel refers to open files by means of non-negative integers called file descriptors. When an existing file is opened or a new file is created, the kernel returns a file descriptor to the process. When we want to read or write a file, we identify the file with the file descriptor that was returned by open or creat as an argument to either read or write.

shell and application convention, not a feature of the UNIX kernel

  • 0 STDIN

  • 1 STDOUT

  • 2 STDERR

POSIX-compliant applications use the symbolic constants defined in header <unistd.h>

  • STDIN_FILENO

  • STDOUT_FILENO

  • STDERR_FILENO

fds range from 0 through OPEN_MAX-1

  • early implementations of UNIX had an upper limit of 19 allowing a maximum of 20 open files per process

  • subsequently, increased to 63

  • the limit is essentially unbounded with FreeBSD 8.0, Linux 3.2.0, Mac OS X 10.6.8, and Solaris 10: it is bounded only by the amount of memory on the system, the size of an integer, and any hard and soft limits configured by the sysadmin

open, openat - open an existing file or create a new file

  • returns fd on okay, -1 on error

  • parameter path is the name of the file to open or create

  • the function has a multitude of options specified by argument oflag, which is formed by ORing together one or more of the following constants from header <fcntl.h>

    • one and only one must be specified

      • O_RDONLY open for reading only (most implementations define it as 0 for compatibility with older programs)

      • O_WRONLY open for writing only (most implementations define it as 1 for compatibility with older programs)

      • O_RDWR open for reading and writing (most implementations define it as 2 for compatibility with older programs)

      • O_EXEC open for execute only

      • O_SEARCH open for search only (applies to directories) - the purpose of this constant is to evaluate search permissions at the time a directory is opened; further ops using the directory’s file descriptor will not re-evaluate permission to search the directory

    • optional

      • O_APPEND - append to the end of file on each write

      • O_CLOEXEC - set the file descriptor flag FD_CLOEXEC

      • O_CREAT - create the file if it doesn’t exist; this option requires a third argument to function open or fourth argument to function openat called mode which specifies the access permission bits of the new file

      • O_DIRECTORY - generate an error if path doesn’t refer to a directory

      • O_EXCL - generate an error if O_CREAT is also specified and the file already exists; this test–whether a file already exists and the creation of the file if it doesn’t exist–is an atomic op

      • O_NOCTTY - if path refers to a terminal device, do not allocate the device as the controlling terminal for this process

      • O_NOFOLLOW - generate an error if path refers to a symbolic link

      • O_NONBLOCK - if path refers to a FIFO, a block special file, or a character special file, this option sets the nonblocking mode for both the opening of the file and subsequent I/O

      • O_SYNC - have each write wait for physical I/O to complete, including I/O necessary to update file attributes modified as a result of the write

      • O_TRUNC - if the file exists and if it is successfully opened for either write-only or read-write, truncate its length to 0

      • O_TTY_INIT - when opening a terminal device that is not already open, set the nonstandard termios parameters to values that result in behavior that conforms to the Single UNIX Spec

    • optional - part of the synchronized input and output option of the Single UNIX Spec & POSIX.1

      • O_DSYNC - have each write wait for physical I/O to complete but don’t wait for file attributes to be updated if they don’t affect the ability to read the data just written

      • O_RSYNC - have each read operation on the file descriptor wait until any pending writes for the same portion of the file are complete

  • ... is the ISO C way to specify that the number and types of the remaining arguments may vary

  • the last argument mode_t mode is used only when a new file is being created

  • parameter fd (function openat only) covers three cases

    • parameter path specifies an absolute path name, parameter fd is ignored, and function openat behaves like function open

    • parameter path specifies a relative path name and parameter fd is a file descriptor that specifies the starting location in the file system where the relative path name is to be evaluated; parameter fd is obtained by opening the directory where the relative path name is to be evaluated

    • parameter path specifies a relative path name and parameter fd has the special value AT_FDCWD; the path name is evaluated starting in the current working directory and function openat behaves like function open

#include <fcntl.h>
int open   (        const char *path, int oflag, ... /* mode_t mode */ );
int openat (int fd, const char *path, int oflag, ... /* mode_t mode */ );
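The atomic test-and-create behaviour of O_CREAT combined with O_EXCL can be sketched as follows. The function name `create_exclusive` and the permission bits 0600 (read and write for the owner) are choices made for this example, not part of the notes above.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Create `path` only if it does not already exist.
 * Returns 0 if this call created the file, or the errno value
 * from open otherwise (EEXIST when the file was already there).
 */
int create_exclusive (const char *path) {
  /* O_CREAT requires the third argument `mode`;
   * O_EXCL makes "check for existence, then create" a single atomic step
   */
  int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

  if (fd < 0)
    return errno;

  close(fd);
  return 0;
}
```

If two processes call this with the same path at the same moment, exactly one of them gets 0 and the other gets EEXIST, which is what makes the pattern usable as a simple lock file.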

function openat is one of a class of functions introduced in POSIX.1 to address 2 problems

  1. Threads in the same process share the same current working directory which makes it difficult for multiple threads in the same process to work in different directories at the same time. Function openat gives threads a way to use relative path names to open files in directories other than the current working directory

  2. Function openat provides a way to avoid TOCTTOU errors. A program is vulnerable if it makes two file-based function calls where the second call depends on the results of the first call. Since the two calls are not atomic the file can change between the two calls thereby invalidating the results of the first call and leading to a program error.


POSIX interprocess communication

  • signal

    • SIGUSR1

    • SIGUSR2


/           # file system root
/etc/
/etc/passwd # passwd file - int UIDs <-> str login names (UID, GID, home directory, interactive login shell)
/etc/group  # group  file - int GIDs <-> str group names

OS        Kernel   Shell
FreeBSD            ash, dash
GNU       Linux    bash, dash
macOS     Darwin   bash, zsh
Solaris
Windows            cmd.exe

UNIX Programmer’s Manual

  • Section 3 defines the general-purpose library functions available to programmers. These functions aren’t entry points into the kernel but they may invoke one or more of the kernel’s system calls (e.g., function printf may use system call write to output a string, but functions strcpy and atoi don’t involve the kernel at all).


Terms#

  • [ w ] 8-bit Computing

  • [ w ] 16-bit Computing

  • [ w ] 32-bit Computing

  • [ w ] 64-bit Computing

  • [ w ] 128-bit Computing

  • [ w ] Address Space

  • [ w ] Anonymous Pipe

  • [ w ] Attribute

  • [ w ] Background Process

  • [ w ] Barrier

  • [ w ] Basic Input/Output System (BIOS)

  • [ w ] Batch Processing

  • [ w ] Berkeley Sockets

  • [ w ] Booting

  • [ w ] Buffer

  • [ w ] Bus

  • [ w ] Busy-Waiting

  • [ w ] Calling Convention

  • [ w ] Child Process

  • [ w ] Concurrency Control

  • [ w ] Context Switch

  • [ w ] Control Flow

  • [ w ] CPU Mode

  • [ w ] Critical Section

  • [ w ] Daemon

  • [ w ] Data Corruption

  • [ w ] Deadlock

  • [ w ] Device Driver

  • [ w ] Device File

  • [ w ] Dining Philosophers Problem

  • [ w ] Directory

  • [ w ] Epoch

  • [ w ] Event

  • [ w ] Everything Is a File

  • [ w ] Exception Handling

  • [ w ] Execution

  • [ w ] Exit Status

  • [ w ] Extended File Attributes

  • [ w ] Fatal Exception Error

  • [ w ] Fatal System Error

  • [ w ] FIFO (Named Pipe)

  • [ w ] File

  • [ w ] File Descriptor

  • [ w ] File Locking

  • [ w ] File Path

  • [ w ] File Sharing

  • [ w ] File System

  • [ w ] File System Permissions

  • [ w ] File Type, Unix

  • [ w ] Fork-Join Model

  • [ w ] Garbage Collection (GC)

  • [ w ] Group Identifier (GID)

  • [ w ] Handle

  • [ w ] Handle Leak

  • [ w ] Hierarchical File System

  • [ w ] Input/Output (I/O)

  • [ w ] Installation

  • [ w ] Inter-Process Communication (IPC)

  • [ w ] Interrupt

  • [ w ] Job Control

  • [ w ] Job Scheduler

  • [ w ] Job Queue

  • [ w ] Kernel

  • [ w ] Kernel Space

  • [ w ] Library

  • [ w ] Linearizability

  • [ w ] Lock

  • [ w ] Logical Address

  • [ w ] Memory Address

  • [ w ] Memory Controller

  • [ w ] Memory Leak

  • [ w ] Memory Management Unit (MMU)

  • [ w ] Memory Page

  • [ w ] Memory Segmentation

  • [ w ] Message Queue

  • [ w ] Multiprocessing

  • [ w ] Multitasking

  • [ w ] Multithreading

  • [ w ] Mutex

  • [ w ] Mutual Exclusion

  • [ w ] Named Pipe (FIFO)

  • [ w ] Network Share

  • [ w ] Network Socket

  • [ w ] Operating System (OS)

  • [ w ] Page

  • [ w ] Parent Process

  • [ w ] passwd

  • [ w ] Path

  • [ w ] Physical Address

  • [ w ] Pipeline

  • [ w ] Pointer

  • [ w ] Portability

  • [ w ] Priority Inversion

  • [ w ] Process

  • [ w ] Process Calculus

  • [ w ] Process Group

  • [ w ] Process Identifier (PID)

  • [ w ] Process Isolation

  • [ w ] Producer-Consumer Model

  • [ w ] Program

  • [ w ] Protection Ring

  • [ w ] Queue

  • [ w ] Race Condition

  • [ w ] Readers-Writers Problem

  • [ w ] Real Time

  • [ w ] Reboot

  • [ w ] Resource

  • [ w ] Resource Acquisition is Initialization (RAII)

  • [ w ] Resource Allocation

  • [ w ] Resource Contention

  • [ w ] Resource Leak

  • [ w ] Resource Management

  • [ w ] Resource Starvation

  • [ w ] Return Statement

  • [ w ] Runtime

  • [ w ] Runtime Library

  • [ w ] Scheduling

  • [ w ] Segmentation Fault

  • [ w ] Semaphore

  • [ w ] Serializability

  • [ w ] setgid

  • [ w ] setuid

  • [ w ] Shared Memory

  • [ w ] Shared Resource

  • [ w ] Shell

  • [ w ] Signal

  • [ w ] Software Aging

  • [ w ] Software Bloat

  • [ w ] Spawn

  • [ w ] Spinlock

  • [ w ] Spinning

  • [ w ] Standard Stream

  • [ w ] Sticky Bit

  • [ w ] Stream

  • [ w ] Superuser

  • [ w ] Synchronization

  • [ w ] System Administrator

  • [ w ] System Call

    • [ w ] close

    • [ w ] exec

    • [ w ] exit

    • [ w ] fork

    • [ w ] open

    • [ w ] read

    • [ w ] wait

    • [ w ] write

  • [ w ] System Time

  • [ w ] Task

  • [ w ] Thread

  • [ w ] Thread Pool

  • [ w ] Time-Of-Check-To-Time-Of-Use (TOCTTOU)

  • [ w ] Unix File Type

  • [ w ] Unix Time

  • [ w ] User

  • [ w ] User Identifier (UID)

  • [ w ] User Space

  • [ w ] Utility Software

  • [ w ] Virtual Address Space

  • [ w ] Virtual Memory

  • [ w ] Virtual Page

  • [ w ] Wall Time

  • [ w ] Zombie Process

  • [ w ] Instruction Cycle

  • [ w ] Instruction Set Architecture (ISA)

  • [ w ] Ancient Unix

  • [ w ] Austin Group

  • [ w ] Berkeley Software Distribution (BSD)

    • [ w ] FreeBSD

    • [ w ] OpenBSD

  • [ w ] illumos

  • [ w ] macOS

    • [ w ] Darwin

    • [ w ] Mach

  • [ w ] The Open Group

  • [ w ] Plan 9 (Bell Labs)

  • [ w ] Portable Operating System Interface (POSIX)

  • [ w ] Research Unix

  • [ w ] Single UNIX Specification (SUS)

  • [ w ] Solaris (Oracle)

    • heritage in BSD and System V

  • [ w ] Sun Microsystems

    • [ w ] SunOS

  • [ w ] TENEX

  • [ w ] Unix-like

  • [ w ] Unix History

  • [ w ] UNIX System III

  • [ w ] UNIX System V

    • [ w ] System V/386 Release 3.2

    • [ w ] System V Release 4 (SVR4)

    • [ w ] HP’s HP-UX

    • [ w ] IBM’s Advanced Interactive eXecutive (AIX)

    • [ w ] STREAMS

  • [ w ] Unix Wars

shells

  • [ w ] Bourne Shell

    • Steve Bourne (Bell Labs)

    • control flow is similar to Algol 68

  • [ w ] Bourne-Again Shell

    • the GNU shell provided with all Linux systems

    • POSIX conformant & compatible with the Bourne shell

    • supports features from both C shell and Korn shell

  • [ w ] C Shell

    • Bill Joy (Berkeley)

    • comes with BSD

    • features that the Bourne shell lacked: job control, a history mechanism, command-line editing

  • [ w ] Command Prompt (cmd.exe)

  • [ w ] Debian Almquist Shell

    • Kenneth Almquist

    • the BSD replacement for the Bourne shell

  • [ w ] Fish Shell

  • [ w ] Korn Shell

    • David Korn (Bell Labs)

    • a successor to the Bourne shell

    • first came with SVR4

    • supports features from C shell

  • [ w ] shells

  • [ w ] TENEX C Shell

    • an enhanced version of the C shell

    • features such as command completion were borrowed from the TENEX operating system

    • note: the POSIX 1003.2 standardization applies to the shell in general, not to tcsh specifically; that specification was based on features from the Korn shell and the Bourne shell

  • [ w ] Unix Shell

  • [ w ] Z Shell


Definitions#

Directory

  • “A directory is a file that contains directory entries [where] each directory entry [consists of] a file name along with…information describing the file’s attributes.” - Advanced Programming in the UNIX Environment, 3e
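
A minimal sketch of reading directory entries, assuming Python's `os.scandir` as a stand-in for the C `opendir`/`readdir` interface the book uses. The helper name `list_entries` is hypothetical. It pairs each file name with two of the file's attributes (whether it is a directory, and its size in bytes).

```python
import os
import stat


def list_entries(path="."):
    """Return a sorted list of (name, is_directory, size_in_bytes) for `path`."""
    entries = []
    with os.scandir(path) as it:
        for entry in it:
            # Attributes of the entry itself; do not follow symbolic links.
            info = entry.stat(follow_symlinks=False)
            entries.append((entry.name, stat.S_ISDIR(info.st_mode), info.st_size))
    return sorted(entries)


if __name__ == "__main__":
    for name, is_dir, size in list_entries("."):
        print(f"{name}\t{'dir' if is_dir else 'file'}\t{size}")
```

Note that `os.scandir` does not report the `.` and `..` entries, so the sketch lists only the names created inside the directory.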

File Descriptor (fd)

  • “[A file descriptor is a] small non-negative integer that the kernel uses to identify the files accessed by a process. Whenever it opens an existing file or creates a new file, the kernel returns a file descriptor that we use when we want to read or write the file.” - Advanced Programming in the UNIX Environment, 3e

  • “A file descriptor is a number that the OS assigns to an open file to keep track of it; consider it a simplified file pointer, analogous to a file handle in C.”
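
The definitions above can be sketched with Python's `os.open`, `os.write`, `os.read`, and `os.close`, which are thin wrappers over the system calls of the same names. The function `roundtrip` and the scratch file name `fd_demo.txt` are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def roundtrip(data):
    """Write `data` through one descriptor, read it back through another.

    Returns (descriptor_used_for_reading, bytes_read).
    """
    path = os.path.join(tempfile.mkdtemp(), "fd_demo.txt")  # hypothetical scratch file

    # Creating a new file: the kernel hands back a descriptor (a plain int).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)

    # Opening the existing file: again the kernel returns a descriptor.
    fd = os.open(path, os.O_RDONLY)
    try:
        return fd, os.read(fd, len(data))
    finally:
        os.close(fd)
```

Calling `roundtrip(b"hello\n")` returns the integer descriptor together with the bytes read back. The exact descriptor value depends on which descriptors the process already has open.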

File Name (fn)

  • “The names in a directory are called file names.” - Advanced Programming in the UNIX Environment, 3e

Operating System (OS)

  • “In a strict sense, an operating system can be defined as the software that controls the hardware resources of the computer and provides an environment under which programs can run. [This software is called the kernel] since it is relatively small and resides at the core of the environment…In a broad sense, an operating system consists of the kernel and all the other software [e.g., system utilities, libraries, shells, applications, etc.] that makes a computer useful and gives the computer its personality.” - Advanced Programming in the UNIX Environment, 3e

Path Name

  • “[A path name is formed by] a sequence of one or more file names…separated by slashes and optionally starting with a slash [and terminated by the null character]. A path name that begins with a slash is called an absolute path name; otherwise, it’s called a relative path name. Relative path names refer to files relative to the current directory. The name for the root of the file system / is a special-case absolute path name that has no file name component.” - Advanced Programming in the UNIX Environment, 3e
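
As a rough illustration of the rules in this definition, the sketch below classifies a path name and splits it into its file name components. The helper names (`classify`, `components`, `resolve`) and the example directory `/home/sar` are assumptions made for the example. `posixpath` is used so the behavior is the same on any platform.

```python
import posixpath


def classify(path):
    """Return 'absolute' if `path` begins with a slash, else 'relative'."""
    return "absolute" if path.startswith("/") else "relative"


def components(path):
    """Return the file names that make up `path`, ignoring empty pieces."""
    return [name for name in path.split("/") if name]


def resolve(path, cwd):
    """Interpret a relative path name against `cwd`; absolute ones are unchanged."""
    return path if classify(path) == "absolute" else posixpath.join(cwd, path)
```

For example, `components("/")` returns an empty list, matching the statement that the root path name has no file name component, and `resolve("doc/memo", "/home/sar")` gives `/home/sar/doc/memo`.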

Process

  • “An executing instance of a program is called a process. Some operating systems use the term task to refer to a program that is being executed.” - Advanced Programming in the UNIX Environment, 3e

Program

  • “A program is an executable file residing on disk in a directory. A program is read into memory and is executed by the kernel as a result of one of the seven exec functions.” - Advanced Programming in the UNIX Environment, 3e
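
The relationship between a program and a process can be sketched as follows, assuming a UNIX-like system and using Python's `os.fork`, `os.execvp`, and `os.waitpid` in place of the C functions. The helper `run_program` is hypothetical, and the exit code 127 for a failed exec is a shell convention adopted here, not something the definition requires.

```python
import os


def run_program(argv):
    """Fork a child process, exec the program named by argv[0], return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the program read from disk.
        try:
            os.execvp(argv[0], argv)  # does not return on success
        except OSError:
            os._exit(127)             # exec failed (e.g., no such program)

    # Parent: wait for the child to terminate and decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1  # terminated abnormally (e.g., by a signal)
```

For instance, `run_program(["sh", "-c", "exit 7"])` starts a new process running the `sh` program and returns 7 once that process exits.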

Shell

  • “A shell is a command-line interpreter that reads user input and executes commands. The user input to a shell is normally from the terminal (i.e., an interactive shell) or…from a file called a shell script.” - Advanced Programming in the UNIX Environment, 3e
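
A toy version of the read-and-execute loop, offered only as a sketch: it has no pipes, redirection, variables, or built-ins. The function `tiny_shell` is a hypothetical name. It accepts any line-oriented stream, so the same loop serves for terminal input and for a script file. Python's `subprocess.run` is assumed as a convenience over the fork/exec/wait sequence.

```python
import shlex
import subprocess


def tiny_shell(stream):
    """Run each command line read from `stream`; return the list of exit statuses."""
    statuses = []
    for line in stream:
        # Split into words; '#' starts a comment, as in shell scripts.
        argv = shlex.split(line, comments=True)
        if not argv:
            continue  # blank line or comment only
        try:
            statuses.append(subprocess.run(argv).returncode)
        except OSError:
            statuses.append(127)  # command could not be executed
    return statuses
```

Passing `sys.stdin` makes it read from the terminal, while passing an open file makes it behave like a shell script interpreter.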

System Call

  • “The interface to the kernel is a layer of software called the system calls.” - Advanced Programming in the UNIX Environment, 3e

Working Directory

  • “[The working directory] is the directory from which all relative path names are interpreted.” - Advanced Programming in the UNIX Environment, 3e
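
A short sketch of how the working directory governs relative path names, using Python's `os.chdir` and `os.getcwd` (wrappers over the `chdir` and `getcwd` functions). The helper `create_relative` is hypothetical.

```python
import os


def create_relative(directory, name):
    """Create `name` via a relative path while `directory` is the working directory.

    Returns the absolute path name of the file that was created.
    """
    previous = os.getcwd()
    os.chdir(directory)  # change this process's working directory
    try:
        # `name` has no leading slash, so it is interpreted relative to `directory`.
        fd = os.open(name, os.O_WRONLY | os.O_CREAT, 0o644)
        os.close(fd)
        return os.path.join(os.getcwd(), name)
    finally:
        os.chdir(previous)  # restore the original working directory
```

The same relative name refers to a different file depending on the working directory in effect when it is opened, which is why the sketch restores the previous directory before returning.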