Outline#

In this week’s lab, you will:

  • Learn to fork and properly reap child processes
  • Learn to execute new programs in the context of an existing process
  • Learn how shells and other similar utilities are implemented
  • Learn how signal handlers are written
  • Practice writing and debugging (concurrent) systems programs

Preparation#

This lab uses a new repository, make sure you fork and clone it!

  1. Log in with your uni ID/password and open the lab template on GitLab here.

  2. Fork the repository by clicking on the fork button. This will create your own copy of the lab repository.

  3. Check the URL bar and make sure it contains your uid (eg. gitlab.cecs.anu.edu.au/*your-uid-here*/comp2310-2023-lab-pack-2). This means you are looking at your copy of the lab, and not the template. Once you’ve done this, click Clone and copy the address next to “Clone with HTTPS”.

  4. Open a terminal and clone with the Git clone command:
    git clone address_from_step_3_here
    

    Type in your uid and password when requested.

  5. You should now have a local copy of the repo on your computer. Try moving to the repo’s folder with the cd and ls commands from earlier.

Using non-Linux systems#

So far if you have been using Windows, you will have been able to do the labs fine, but from this point on you will be using system calls to interact with the operating system. The interface for Windows is unfortunately different from the Unix interface used by Linux and MacOS (which is what we’ll be using in these labs).

Fortunately, you can use the Windows Subsystem for Linux (WSL) to continue working from home. See detailed instructions here.

Alternatively you can use the VirtualBox VM.

Introduction#

In the previous tutorials you’ve practiced C code, as well as understood a bit on how that C code is compiled into the assembly language you learnt all about in COMP2300. Now we will teach you to write systems software, the cornerstone of systems programming. Systems programming involves writing low-level code that interfaces directly with the operating systems kernel and system libraries. In the coming tutorials, you will write and implement code for controlling processes and handling signals, and even write a memory allocator for manual memory management (which you have so far done via C library’s malloc()/free() functions.) Knowing the art of systems programming will make you a powerful programmer. With the skills gained, you can start hacking virtual machines, interpreters, editors, shell environments, web browser internals, etc. This specific tutorial deals with managing processes and handling signals.

Background#

The csapp.h header file#

In this tutorial, we provide you with the csapp.h header file, which contains error-handling wrappers for the system-level functions we have studied in the lectures. The wrappers redefine the vanilla system-level function with an uppercase. For example:

  • The wrapper for the fork() system call is Fork()
  • Use Wait() and Waitpid() instead of wait() and waitpid()
  • Use Execve() instead of execve
  • Signal() instead of signal()
  • Kill() instead of kill()
  • Sleep() instead of sleep()

To keep the code concise, you should always use the wrapper functions in this tutorial instead of the vanilla system call. You can consult the csapp.h header file when looking for a specific wrapper. The header file defines several other useful functions, e.g., unix_error().

If you need a refresher on what a header file is, see this section of the Lab 1 page.

Reading the man page#

There are a lot of details about system-level functions that we cannot cover in lectures or tutorial handouts. It would help if you were comfortable using the manual pages on Unix-like operating systems. As a start, try skimming the manual pages for the fork system call: man fork. Then, spend some time reading man execve.

On the man pages, you can also find valuable information about utility programs such as ls and grep.

Shell and terminal#

A shell is an interactive user application program that runs other programs on behalf of the user. Note that a terminal is an interface that provides a display for output and a keyboard for input to a shell session. Shell is the interpreter that executes commands typed as strings. Typically, shells can run a job in the foreground or background. While a foreground job runs, the shell waits for the job to finish before interpreting the following command the user types. The shell user can submit a background job by appending an & after the command. Below is a typical session with the shell environment:

shell-session

If you are new to working in a shell environment, there are several aspects of user interaction to note in the above session:

  • Some commands are built-in shell commands. In the above session, the examples are the echo and cd commands. The shell can also run any program for the user. For example, ls and forks are external programs. We need to inform the shell where a specific program resides by using the PATH environment variable. This has been done for ls, but not for forks. Therefore, to run forks, we prefix forks with ./ to tell the shell to look for forks in the current working directory. You can display environment variables such as PATH like so: $PATH.

  • When we type ./forks 1, the shell waits for forks to finish executing and only returns to read/interpret the next command after forks terminates.

  • When we run forks in the background by using an &, the shell returns immediately, printing some information about the child process it creates. You can verify that & indeed sends the job in the background by writing a program with a while (1) {;} loop, submitting it in the background, and then using ps to verify the program is still running. Try running the same program in the foreground without the &.

Foreground and background jobs#

In the second exercise, the very basic shell we implement allows the user to submit a job in the background with an &. Make sure you understand the difference between foreground and background jobs, as explained above.

ps#

Linux and most Unix-like operating systems provide a utility called ps for viewing information about processes on a system which is an abbreviation for Process Status. The ps command is used to list the currently running processes and their PIDs along with some other information depends on different options.

You will need to use ps for verifying your intuition about the existence of zombies in the first exercise. Please see man ps in the case you need to know more details. By default, simply typing ps shows information about the processes running in the current shell.

fork(), wait(), waitpid(), and execve()#

The lecture slides from the third week cover all these system-level functions in detail. Please refer to the slides in the case you need to brush up your understanding.

Here, we discuss execve() again for your convenience. The execve function loads and runs a new program in the context of the current process.

#include <unistd.h>

int execve(const char *filename, const char *argv[], const char *envp[]);

This function loads and runs the executable object file filename with the argument list argv and the environment variable list envp. Execve returns to the calling program only if there is an error (e.g., filename does not exist). To see how the argument list and the list of environment variables look like before and after the call to execve, check the lecture slides.

Note that execve belongs to the exec family of system calls. The v in execve comes from the fact that it takes an argument argv to the vector of arguments to the program (the same way the main function of a C program may take it). The e comes from its second set of arguments, envp, which is a vector of the environment variables (key/value pairs like LANG=english).

Signals#

A signal is a small message that notifies a process that an event of some type has occurred in the system. Each signal has a name and a number. For example, on Linux OS, SIGINT, SIGKILL, and SIGSEGV are famous signals with numbers 2, 9, and 11, respectively. Signals provide the kernel a mechanism to expose low-level hardware exceptions (e.g., divide by zero) to the user processes. Imagine a process attempts to divide by zero, then the kernel sends a SIGFPE signal (number 8) to the process. If a process makes an illegal memory reference, the kernel sends it a SIGSEGV.

If you type Ctrl+C (i.e., press the Ctrl key and the ‘c’ key at the same time), then the kernel sends a SIGINT (number 2) to each process in the foreground process.

A process can forcibly terminate another process by sending it a SIGKILL signal.

When a child process terminates or stops, the kernel sends a SIGCHLD signal to the parent.

Signal handlers#

The kernel delivers a signal to a process for two reasons:

  • The kernel has detected an event such as divide-by-zero error or a child process has terminated
  • A process has invoked the kill function to explicitly request the kernel to send a signal to the destination process.

There is a default action associated with each type of signal. A process can modify the default action associated with a signal by installing a signal handler. A process can install a signal handler by using the signal function. This function takes two arguments: the number of the signal and a signal handler associated with the signal. The prototype of the signal function is as follows.

#include <signal.h>

typedef void (*sighandler_t)(int);

sighandler_t signal(int signum, sighandler_t handler);

The function returns a pointer to the previous handler if everything goes well, and SIGERR on error (does not set errno).

For reasons of portability, you should always use the names of signals in your programs and not the numbers (you need to include signal.h). This is because the actual integer associated with a signal may differ between systems.

You can install a signal handler by calling the signal function and providing the signal number and a pointer to a function that returns void and take an integer argument. When a process catches a signal of type k, the handler installed for signal k is invoked with a single integer argument set to k. This argument allows the same handler function to catch different types of signals.

Sending signals with the kill function#

Processes send signals to other processes (including themselves) by calling the kill function.

#include <sys/types.h>
#include <signal.h>

int kill(pid_t pid, int sig);

If pid is greater than zero, then the kill function sends signal number sig to process pid.

Printing from signal handlers#

It is unsafe to call printf, sprintf, malloc, and several other system-level functions inside a signal handler. In technical terms these functions are not reentrant. This typically means these functions do not make use of local variables only. They maintain state across function calls (e.g., by declaring variables with the static storage duration).

You should use functions from the Safe I/O package provided as part of csapp.h for safely printing output on the screen from your signal handlers.

#include "csapp.h"

ssize_t sio_putl(long v);
ssize_t sio_puts(char s[]);

void sio_error(char s[]);

The sio_putl and sio_puts functions emit a long and a string, respectively, to standard output. The sio_error function prints an error message and terminates.

Exercise 1: Understanding fork(), wait, and waitpid()#

This exercise provides you with a source file, namely forks.c. The program implements various functions that use fork() and wait() in different ways. You can run the program by providing a number as an input argument. For example, to run the first function called fork1(), you can run ./forks 1.

Read the code for each function and then, without running, determine the function’s behaviour, including what will be printed on the screen. You will need to predict if the function will create any zombies. Verify your prediction using a combination of output on the screen and the ps utility. Terminate any zombie processes with the kill command.

Exercise 2: Properly reaping background jobs in shellx#

Programs such as shells in Unix-like operating systems and web servers make heavy use of fork and execve system calls. This exercise provides you with a basic template for building a shell. The shell program properly reaps foreground jobs. But it does not reap background jobs properly, leaving zombies in the system. Your task is to understand the program and then properly reap the background jobs by installing a signal handler.

Open the file shellx.c and try to understand the code, including key functions. You should have an intuition what the program is doing: reading and parsing user commands, and interpreting them one-by-one, and reaping foreground jobs properly.

In shellx.c, install a signal handler for the SIGCHLD signal. Every time the kernel sends the parent process the SIGCHLD signal, the parent should reap the defunct child process. Your signal handler should not exit the program, and return control to the main program.

Exercise 3: Debugging a signal handler for reaping multiple child processes#

In this exercise, we provide a program that uses signals to allow a parent process to count events where each event corresponds to a terminated child. The idea is that the parent process tracks the number of terminated children in a global variable. The program creates N child processes and sets a variable called ccount to N. Then, the signal handler decrements ccount each time it runs. If everything is correct, the parent process will eventually terminate. The parent will run forever if something is wrong with the program. Can you see what the program is doing? Can you verify that the parent indeed runs forever?

As background, we provide you with the following property regarding signals that are key to figuring out and fixing the bug. Pending signals of a specific type are not queued. The kernel maintains a pending bit vector containing exactly one bit for every signal type. Therefore, there can be at most one pending signal of a particular type. The existence of a pending signal merely indicates that at least one signal has arrived. If the signal handler is executing and two more signals are delivered to the destination process, then the kernel only keeps track of the fact that there is at least one pending signal of type k. It does not keep track of the number of signals of type k delivered while the signal handler runs.

Open the file child_handler.c. Identify and try to fix the bug in child_handler().

Volatile#

In the child_handler.c file you may have noticed the line

volatile int ccount = 0;

The volatile keyword in C is a a qualifier that is applied to a variable when it is declared. Its main purpose is to inform the compiler that the value of the variable may change at any time, even in the absence of any updates to a variable in the the code the compiler finds nearby.

So, compilers when generating code for the rest of the program sometimes cannot predict that the value of a variable can change by an interrupt handler or a signal handler. These handlers are not tightly coupled with the rest of the program like normal routines/functions.

Consider the following code somewhere in main():

int _exit = 0;

main() {
    while (!_exit) {
        /* tight spin loop which is completely visible to the compiler */
    }
    return;
}

The compiler can see that the loop body and the rest of the main() function does not touch (or write/update) _exit . Therefore, it is allowed to convert the loop to a while (true) loop. Even if the _exit variable is set to 1 by a signal handler for SIGINT, the compiler has no way to know that.

In this example, if the _exit variable is declared volatile, the compiler will generate assembly code that will load the most up-to-date value of the variable _exit from memory every time. Thus, the volatile qualifier tells the compiler that the specific variable it qualifies can be modified elsewhere.

Similarly, the compiler can notice that a variable never changes its value, and stores (caches) its value in a fast register (treating it as a constant) instead of reading it from memory. The signal handler could instead (not knowing the variable is stored in a register) update the variable directly in memory using the MOV instruction. This is also where we need volatile.

Note that the combination of your specific compiler and operating system may lead to different behaviour. Write a couple signal handlers as part of the lab exercises and let us know if you observe anything strange!

Exercise 4: Writing a snooze function#

The sleep function suspends a process for a specified period of time.

#include <unistd.h>

unsigned int sleep(unsigned int secs);

Sleep returns zero if the requested amount of time has elapsed, and the number of seconds still left to sleep otherwise. The latter case is possible if the sleep function returns prematurely because it was interrupted by a signal.

Write a wrapper function for sleep called wakeup, with the following interface:

unsigned int wakeup(unsigned int secs);

The wakeup function behaves exactly as the sleep function, except that it prints a message describing when the process actually woke up:

Woke up at 4 secs.

Now, write a program called snooze that takes a single command-line argument, calls the wakeup function with this argument, and then terminates. Write your program so that the user can interrupt the snooze function by typing Ctrl+C at the keyboard. For example:

shell> ./snooze 10
CTRL+C
Slept for 3 of 5 secs.
shell>

Exercise 5: Inter-Process Communication#

A process can kill another process using the kill() function (as explained in the background above). In this exercise, we ask you to kill the child processes running in a forever loop.

Open the file kill_children.c. You can see that the parent process creates children and then reaps them. Unfortunately, the children will keep executing forever. Your job is to use kill() to send a SIGINT signal to the child processes. Recall that by default on receipt of the SIGINT function the recipient process terminates. Once again, use the Kill() wrapper function from the provided header file.

Exercise 6: Understanding concurrency bugs#

It is challenging to manage concurrent flows that read and write the same program data. We need a way to synchronize the concurrent flows. Otherwise, we end up with interleaving of instructions (or program statements) that lead to the wrong outcome. In this exercise, we explore a subtle concurrency bug in a cooked-up shell program. We will study concurrency in detail in the second half of the semester. Nevertheless, here, we want you to identify the problem and make an attempt to fix the problem.

First, we provide some background on blocking and unblocking signals. By default, the kernel blocks any pending signals of the type currently being handled by a handler. More importantly, applications can explicitly block and unblock selected signals using the sigprocmask function and its helpers. The sigprocmask function changes the set of currently blocked signals by manipulating the blocked bit vector. Read the code below carefully to see how sigprocmask and its helper functions are used. We use the capitalized wrappers from csapp.h.

/* Create masks */
sigset_t mask, prev_mask;
/* Initialize the mask to empty set. None of the signals are in the set yet */
Sigemptyset(&mask);
/* Add the SIG_INT signal to the set */
Sigaddset(&mask, SIG_INT);
/* Block SIGINT and save previous blocked set */
Sigprocmask(SIG_BLOCK, &mask, &prev_mask);
/* Removes SIGINT from blocked */
Sigprocmask(SIG_UNBLOCK, &mask, &prev_mask);
/* Restores previous blocked set */
Sigprocmask(SIG_SETMASK, &prev_mask, NULL);

Finally, the Sigfillset function adds every signal to the mask.

Open the file procmask1.c. You will note that in main(), the parent process adds a child process to a list. After receiving a signal, the handler removes the child process from the list. In essence, the program keeps track of which processes are currently running. Note that we have intentionally kept the functions empty to focus on the interleaving of events that leads to a concurrency bug. Your job is to identify the sequence of events that may lead to an erroneous outcome. One erroneous outcome is that a child is removed from the list before it is added to the list by the parent. Once you have identified the problem, change the code to fix the concurrency bug.

bars search times arrow-up