Lab 10: Concurrency 2

Outline#

In this week’s lab, you will:

Learn three approaches to writing concurrent server applications
Learn to use select() for I/O multiplexing
Write an event-based server that responds to three different types of clients concurrently
Learn the basics of threads and their use in writing concurrent applications

Preparation#

Do an upstream pull to get the lab10 directory containing all the necessary files into your clone of the lab pack. Run the following commands in your local clone of the lab pack 4 repository:

git remote add upstream https://gitlab.cecs.anu.edu.au/comp2310/2024/comp2310-2024-lab-pack-4.git
git pull upstream
git rebase upstream/main

Introduction#

We previously introduced the echo and tiny web servers in Lab 8. These were both examples of sequential web servers that can only handle one client at a time. When a sequential server is communicating with one client, other clients will need to wait for that client to disconnect before they can be served.

In this lab, we will explore different approaches to making client-server web applications respond to multiple clients concurrently. The lectures introduced three approaches for writing concurrent server applications:

Process-based: this uses fork to create a child process to handle each new client.
Thread-based: this is similar to the process-based approach, but it uses the light-weight abstraction of a thread rather than a process.
Event-based: this uses a programming pattern called I/O multiplexing via the select system call to achieve concurrency.

Process vs. Thread#

You were introduced to pthreads in the Week 8 Lectures/previous lab, and processes in the Week 3 Lectures/Lab 4.

As a brief summary of the difference between processes and threads, a thread is a light-weight abstraction of a logical flow. More specifically, the threads of a process share that process's address space, while each thread has its own program counter (PC), registers, and stack. This means that creating threads and context switching between them incurs less overhead than doing the same with processes. Threads are managed by the OS.
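The difference in memory sharing can be observed directly. The following is a minimal sketch, not part of the lab pack: the names counter, bump, counter_after_thread, and counter_after_fork are made up for illustration. A thread's write to a global is visible to its creator, while a forked child only changes its own copy.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0; /* global: lives in the process's data segment */

/* Thread routine (hypothetical): increments the shared global. */
static void *bump(void *arg) {
  (void)arg;
  counter++; /* same address space, so the creating thread sees this write */
  return NULL;
}

/* Returns the value of counter the caller sees after a thread bumps it. */
int counter_after_thread(void) {
  pthread_t tid;
  counter = 0;
  if (pthread_create(&tid, NULL, bump, NULL) != 0) return -1;
  pthread_join(tid, NULL); /* wait so the write has definitely happened */
  return counter;
}

/* Returns the value of counter the parent sees after a child bumps it. */
int counter_after_fork(void) {
  counter = 0;
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    counter++; /* changes the child's private copy only */
    _exit(0);
  }
  waitpid(pid, NULL, 0);
  return counter; /* parent's copy is untouched */
}
```

Calling counter_after_thread() yields 1, whereas counter_after_fork() yields 0, because the child's increment never reaches the parent's memory.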

Exercise 1: Process-Based Echo Server#

For this first exercise, you will not have to write any code! Instead, your task is to read the provided code for a process-based concurrent echo server and make sure you understand how it works. The process-based server uses fork() to create a child process to handle each new client. Pay particular attention to where each file descriptor is closed in both the parent and child processes.
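As a rough illustration of the fork-per-client pattern, here is a minimal sketch. It is not the contents of the provided server file; the function names echo_until_eof, handle_in_child, and serve_forever are hypothetical, and error handling is kept to a minimum.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Echo bytes back on connfd until the peer closes its end. */
static void echo_until_eof(int connfd) {
  char buf[256];
  ssize_t n;
  while ((n = read(connfd, buf, sizeof buf)) > 0) {
    if (write(connfd, buf, (size_t)n) != n) break; /* sketch: give up on short write */
  }
}

/* Hand one connected descriptor to a new child process.
   Returns the child's pid in the parent, or -1 if fork failed. */
pid_t handle_in_child(int listenfd, int connfd) {
  pid_t pid = fork();
  if (pid == 0) {
    close(listenfd);  /* the child never calls accept */
    echo_until_eof(connfd);
    close(connfd);
    _exit(0);
  }
  close(connfd);      /* parent drops its copy of the descriptor */
  return pid;
}

/* Accept loop: one child per client, reaping finished children as we go. */
void serve_forever(int listenfd) {
  for (;;) {
    int connfd = accept(listenfd, NULL, NULL);
    if (connfd < 0) {
      if (errno == EINTR) continue; /* interrupted by a signal: retry */
      break;
    }
    handle_in_child(listenfd, connfd);
    while (waitpid(-1, NULL, WNOHANG) > 0) {
      /* collect any children that have already exited */
    }
  }
}
```

Compare where this sketch closes each descriptor with what the provided server does, and keep that in mind for the discussion questions.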

To compile the process-based server, run make echoserverp in the lab10 directory. Run make echoclient to build the echo client program, and run it in the same way as you did in Lab 8.

Run the process-based echo server locally. Try to connect to it from multiple instances of echoclient launched simultaneously from different terminals. Can you leave one client hanging and still get a response on another?

Discuss the following questions with either your fellow classmates or your tutor to check your understanding of the process-based echo server.

Why must the parent process close the connected descriptor?
What are the consequences of a child process forgetting to close listenfd?

Exercise 2: Thread-Based Echo Server#

You have also been provided with unfinished code for a thread-based echo server in the echoservert.c file. Your second exercise in this lab is to complete this code so that you have a working thread-based server. You are given the following code to accept client connections in the main function:

while (1) {
  clientlen = sizeof(struct sockaddr_storage);
  connfdp = malloc(sizeof(int));
  if ((*connfdp = accept(listenfd, (SA *)&clientaddr, &clientlen)) < 0) {
    perror("accept");
  }
  /* ... */
}

And the following thread routine:

/* Thread routine */
void *thread(void *vargp) {
  int connfd = *((int *)vargp);
  pthread_detach(pthread_self());
  /* TODO: finish this function! */
  return NULL;
}

The pthread_detach function indicates to the implementation that storage for the thread can be reclaimed when the thread terminates. This means that the thread does not need to have its resources cleaned up explicitly by a call to pthread_join.
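To see the detach pattern in isolation, here is a minimal sketch that is deliberately not an echo handler. The names worker and spawn_detached are hypothetical. The worker receives a heap-allocated descriptor, detaches itself, releases the heap slot, and writes a single byte to the descriptor.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical worker: takes ownership of a heap-allocated descriptor. */
static void *worker(void *vargp) {
  int fd = *((int *)vargp);
  pthread_detach(pthread_self()); /* nobody will call pthread_join on us */
  free(vargp);                    /* value copied out, so release the slot */
  char done = '!';
  if (write(fd, &done, 1) != 1) {
    /* a real handler would report the error here */
  }
  return NULL;
}

/* Start a detached worker for fd. Returns 0 on success. */
int spawn_detached(int fd) {
  pthread_t tid;
  int *fdp = malloc(sizeof *fdp);
  if (fdp == NULL) return -1;
  *fdp = fd; /* each thread gets its own copy, so later calls cannot overwrite it */
  int rc = pthread_create(&tid, NULL, worker, fdp);
  if (rc != 0) free(fdp); /* thread never started, so the caller still owns fdp */
  return rc;
}
```

Passing a separately allocated int to each thread, rather than the address of one shared local variable, avoids a race between the creating thread's next assignment and the new thread's read.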

Why would we not want our main thread to call pthread_join on the threads we create in the main function?

Your task is to complete the code given.

In echoservert.c, you will need to:

Modify main to call pthread_create with the correct arguments such that a new thread running thread is created with connfdp passed in as its argument.
Fill in the rest of the thread function so that it handles the entire echo client connection, including freeing the memory used by connfdp when it is no longer needed.

Test your implementation by compiling echoservert and connecting to it with multiple clients at once.

Exercise 3: Improving the Thread-Based Echo Server#

The remaining exercises for this lab are included with very little explanation of how to attempt them. In this particular case, the idea behind creating a thread-pool for a concurrent echo server was covered in the Week 9 Lectures.

It isn’t very efficient to create a new thread for every new client connection.

A better approach is to create a fixed number of threads once during program startup, each of which waits for the main thread to assign it a client connection to handle.

Modify echoservert.c to use a thread pool for managing client connections. You will want to use the shared buffer code you wrote in the two extension tasks for exercise 3 of the previous lab to distribute file descriptors of connected clients.
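The overall shape of a thread pool can be sketched as follows. This is a minimal sketch under stated assumptions, not the lecture's implementation: it uses a mutex and condition variables rather than your shared buffer from the previous lab, and the names queue_t, queue_put, queue_get, pool_worker, and run_pool are hypothetical. Plain integers stand in for connected descriptors, and "serving" an item just adds it to a total.

```c
#include <pthread.h>

#define QCAP 16
#define NWORKERS 4

/* Bounded FIFO of ints (stand-ins for connected descriptors). */
typedef struct {
  int items[QCAP];
  int head, count;
  pthread_mutex_t lock;
  pthread_cond_t not_empty, not_full;
} queue_t;

void queue_init(queue_t *q) {
  q->head = q->count = 0;
  pthread_mutex_init(&q->lock, NULL);
  pthread_cond_init(&q->not_empty, NULL);
  pthread_cond_init(&q->not_full, NULL);
}

/* Producer side: blocks while the queue is full. */
void queue_put(queue_t *q, int item) {
  pthread_mutex_lock(&q->lock);
  while (q->count == QCAP) pthread_cond_wait(&q->not_full, &q->lock);
  q->items[(q->head + q->count) % QCAP] = item;
  q->count++;
  pthread_cond_signal(&q->not_empty);
  pthread_mutex_unlock(&q->lock);
}

/* Consumer side: blocks while the queue is empty. */
int queue_get(queue_t *q) {
  pthread_mutex_lock(&q->lock);
  while (q->count == 0) pthread_cond_wait(&q->not_empty, &q->lock);
  int item = q->items[q->head];
  q->head = (q->head + 1) % QCAP;
  q->count--;
  pthread_cond_signal(&q->not_full);
  pthread_mutex_unlock(&q->lock);
  return item;
}

typedef struct {
  queue_t q;
  long total;
  pthread_mutex_t total_lock;
} pool_t;

/* Worker: repeatedly takes an item; a negative item means "shut down". */
static void *pool_worker(void *arg) {
  pool_t *p = arg;
  for (;;) {
    int item = queue_get(&p->q);
    if (item < 0) return NULL;
    pthread_mutex_lock(&p->total_lock);
    p->total += item; /* stand-in for serving the client on descriptor item */
    pthread_mutex_unlock(&p->total_lock);
  }
}

/* Start the pool, feed it 1..nitems, stop it, and return the total. */
long run_pool(int nitems) {
  pool_t p;
  pthread_t tids[NWORKERS];
  p.total = 0;
  queue_init(&p.q);
  pthread_mutex_init(&p.total_lock, NULL);
  for (int i = 0; i < NWORKERS; i++)
    pthread_create(&tids[i], NULL, pool_worker, &p);
  for (int i = 1; i <= nitems; i++) queue_put(&p.q, i);
  for (int i = 0; i < NWORKERS; i++) queue_put(&p.q, -1); /* one sentinel each */
  for (int i = 0; i < NWORKERS; i++) pthread_join(tids[i], NULL);
  return p.total;
}
```

In a server, the main thread would call queue_put with each descriptor returned by accept, and the workers would run forever instead of stopping on a sentinel.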

Exercise 4: Event Based Server (Extension Task)#

The third approach to writing concurrent web servers is to use a programming pattern called I/O multiplexing, implemented here with the select() system call. You can read more about select() in its man page (man 2 select).

Select Demo#

You are provided with a select.c demo program. The program adds two descriptors to the read set. It then calls select, which suspends the program until at least one of those descriptors is ready for reading. When it wakes up, it checks whether either or both descriptors in the ready set are set to 1 (the read and ready sets are bit vectors). If listenfd is set, the program accepts the client and calls the echo() function. Otherwise, the program calls the command() function.

Let’s answer a few questions to check your understanding of select.c.

Is the select.c program an example of a concurrent application? Explain the sources of concurrency in select.c.
Notice that we assign read_set to ready_set at the start of the iterative while loop. Why is that?
If select.c is run on a multicore processor, will we observe parallelism during execution?
Can you change the definition of the echo() function such that the program responds to keyboard input (pending or otherwise) as soon as possible rather than waiting for the connected descriptor to close?
Why do we use listenfd+1 in the call to select?
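Once you have attempted the questions above, the core select pattern can be compared against this minimal sketch. It is not the provided select.c: the function wait_readable and the READY_A and READY_B constants are hypothetical, and any two readable descriptors (for example, the read ends of two pipes) can be passed in.

```c
#include <sys/select.h>
#include <unistd.h>

#define READY_A 1
#define READY_B 2

/* Block until fd_a and/or fd_b is readable.
   Returns a bitmask of READY_A / READY_B, or -1 on error. */
int wait_readable(int fd_a, int fd_b) {
  fd_set read_set, ready_set;
  FD_ZERO(&read_set);
  FD_SET(fd_a, &read_set);
  FD_SET(fd_b, &read_set);

  ready_set = read_set; /* select modifies the set it is given, so pass a copy */
  int maxfd = fd_a > fd_b ? fd_a : fd_b;
  /* First argument is the highest descriptor number in the set, plus one. */
  if (select(maxfd + 1, &ready_set, NULL, NULL, NULL) < 0) return -1;

  int mask = 0;
  if (FD_ISSET(fd_a, &ready_set)) mask |= READY_A;
  if (FD_ISSET(fd_b, &ready_set)) mask |= READY_B;
  return mask;
}
```

A server loop would call something like this repeatedly, dispatching to a different handler for each bit set in the result.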

Select-Based Echo Server#

Your final task is to first read and understand the select-based server provided in echoservers.c, and then heed the following advice from the man page for select:

WARNING: select() can monitor only file descriptors numbers that
are less than FD_SETSIZE (1024)—an unreasonably low limit for
many modern applications—and this limitation will not change.
All modern applications should instead use poll(2) or epoll(7),
which do not suffer this limitation.

Implement an event-based concurrent web server using epoll rather than select.
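As a starting point, the two core epoll operations can be sketched as below. This is a minimal, Linux-only sketch and not a complete server: the helper names watch_fd and next_ready_fd are hypothetical, and the epoll instance itself is assumed to have been created by the caller with epoll_create1(0).

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Register fd with the epoll instance for "ready to read" events.
   Returns 0 on success, -1 on error. */
int watch_fd(int epfd, int fd) {
  struct epoll_event ev = {0};
  ev.events = EPOLLIN;
  ev.data.fd = fd; /* handed back to us by epoll_wait */
  return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait up to timeout_ms for one ready descriptor (-1 waits forever).
   Returns the ready descriptor, or -1 on error or timeout. */
int next_ready_fd(int epfd, int timeout_ms) {
  struct epoll_event ev;
  int n = epoll_wait(epfd, &ev, 1, timeout_ms);
  return n == 1 ? ev.data.fd : -1;
}
```

Unlike select, the set of watched descriptors is kept inside the kernel between calls, so it does not have to be rebuilt or copied on every iteration of the event loop.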

Conclusion#

If you’ve reached this point, you have now completed all the lab content for COMP2310/6310. Excellent work ✨🎉! You may now want to spend the remainder of your time either working on your second assignment or completing some of the practice exams available to you.