Outline
In this week’s lab you will:
- Learn about creating composite data types with structs in C
- Learn about optimizing data structure layout for modern memory systems
- Practice writing code for traversing data structures in C
Preparation
- Do an upstream pull to get the latest changes to the lab repository.
git fetch upstreamgit pull --no-ff --no-edit upstream master - Open the
lab10folder in VS Code.
Introduction
So far, we have only dealt with variables of a single data type. For example, we have written code to declare and manipulate integer and character variables. This tutorial will introduce grouping objects (or variables) of different data types into a single entity. Specifically, we will introduce structs in C. We will also practice writing code that involves structures.
Structs
The C struct declaration creates a data type that group variables of different types into a new type. The different components that make up the struct can be referenced via names. Similar to arrays, the different components of a structure are stored in a contiguous region in memory. A pointer to a struct is the address of its first byte. As an example, consider the following declaration of a new composite data type called record,
struct record {
int id;
char name[10];
};
The above declaration of a struct creates a new composite structure called record. The structure contains two fields or members: a 4-byte variable of type int, and a ten-element array of type char. Note that the character array is embedded within the structure. The byte offset of the first character from the start of the structure is 4 (an integer is 4 bytes).
The above struct declaration only defines a new type. No space is reserved in memory yet for the members of the record structure. In fact, no record even exists. We can define a new variable of type record and initialise its members as follows,
struct record r1 = {10, "Amanda"};
Once a structure is both declared and a variable of its type defined, we can reference (read or write) its members. We can use the dot operator (.) to reference members of a structure. If we have a pointer to a structure, we can reference its members with the structure pointer (->) operator. Consider the following example,
struct record r1;
r1.id = 10;
r1.name = "Amanda";
In the above example, we have used the dot operator to initialise the members of the struct r1. Now, consider the following code,
struct record r1 = {10, "Amanda"};
struct record *r2 = &r1;
r2->id = 17;
r2->name = "Helen";
We first define a record r1 and initialise its members (the first statement). We then define a pointer with the type record and assign it the address of r1. The r2 pointer contains the memory address of the first element of the r1 structure. Now, we can use the -> operator to access (and read/update) r1 members via the r2 pointer. To properly access the second member (name) of the r1 structure (the third statement), the compiler will add the appropriate offset (4 bytes) to the starting address of r1. We call this process translation from the symbolic reference of a struct member (r2->name) to the appropriate memory address.
When defining a structure such as r1 above, it is cumbersome to prefix struct record before the variable name. C provides a facility called typedef for creating new data type names for convenience. The following declaration,
typedef struct record new_record;
makes the name new_record a synonym for struct record. We can then define a new record variable as follows.
new_record r1 = {17, "Jack"};
When we pass structures to function as arguments, the whole structure (i.e., every member) is copied on the stack frame. It is more efficient to pass a pointer to the structure.
Open the program src/record.c and read the code. Try to guess the output of the program. Compile and run the program to check your guess.
Note that we have declared a new structure outside the main function. We have also declared several functions that we define later in the code (after the main function). Declaring the function header first before defining it in informs the compiler that we intend to define the function later (and use it before its definition in the code).
Open the program src/record.c and this time complete the definitions of the two functions equal and equal_with_copy. Both functions return 1 if the two structures passed as arguments are equal. Otherwise, they return 0. We define equality as follows: (1) the integer ids are equal, (2) the records contain similar names. Use your string utility functions from the previous lab or use an appropriate function from string.h. You can find the available functions in string.h via an online search.
Open the program src/buggy-record.c and read the code. There is a serious bug in the function create_record. Use your understanding of how functions operate at the assembly level and try to explain the bug. You can revisit call stack and stack frames from the lecture slides. Check with your tutor if you think you have figured it out.
Linked List
We will now use C structures to build a simple linked list data structure. A linked list is a data structure with a collection of elements or nodes and each node contains a pointer to the next element in the list. A linked list allows traversal over all its elements starting from the head node. To create a linked list of records, we first need to add a new member to the structure that points to the next element in the list. The revised declaration of the record structure is as follows,
struct record {
int id;
char name[10];
struct record *next;
};
The following code creates a linked list of three student records and connects them together into a linked list.
new_record student3 = {23, "Anson", NULL};
new_record student2 = {22, "Steve", &student3};
new_record student1 = {10, "Jack", &student2};
Note that student3 is the last element in the linked list and points to nothing. The NUll value is a built-in C constant that has a value of zero. In fact, it is a synonym for the character 0 used to terminate strings in C. Null can also be the value of a pointer, which is the same as zero.
Open the program src/linkedlist.c and compile and run the program. Observe carefully the definitions of the two functions to print the record and a linked list of records.
It is more convenient to have a function that adds a new record to the end of the linked list of records. So we can write code as follows:
new_record student1 = {10, "Jack", NULL};
new_record student2 = {22, "Steve", NULL};
new_record student3 = {23, "Anson", NULL};
insert(&student1, &student2);
insert(&student1, &student3);
Open the program src/linkedlist.c again and complete the function definition of the insert function. Test your function definition is correct by printing the linked list of records.
We would like to have the ability to delete records from the linked list given the values for the members of a record. Complete the definition of the delete_record function. The function takes the head node and integer id and name as arguments. If a record has matching id and name, it is deleted from the linked list. Test your code by printing the linked list.
Array of Structs
We can declare an array of structs the same way we can declare arrays of ordinary types. The following declares and defines an array of 100 records,
new_record records[100];
We can access element of the records array the same way we access elements of ordinary arrays. The following example assigns an id of 40 to the tenth element of the records array,
records[9].id = 40;
Alternatively, we can use pointer arithmetic and add the appropriate offset to the starting address of the array of records. Consider the following
(records + 9)->id = 40;
As discussed before, the name of an array of type T is in essence a pointer of type T. When we add 9 to a pointer of type new_record, the compiler under the hood multiplies 9 by the size of the structure to access the tenth element of the array. The -> operator then accesses the id of that element.
What do you expect the sizeof operator to return when applied on the new_record type? Write a small program to test your understanding.
Case Study: Struct of Arrays
We will now do a case study to gain a deep insight into how data structures interact with the memory hierarchy of modern compute systems. Consider the following structure that represents a point in a 3-dimensional space,
struct point3D {
int i;
int j;
int k;
};
// array of structs
struct point3D points[LEN];
Suppose we have an array of several 3D points (call the array points) and we want to sum all the points along the k dimension. Each point3D element of the points array is 12 bytes in size (4 bytes for i, 4 bytes for j, and 4 bytes for k). Recall that members of a structure are placed contiguously in memory. Recall also that elements of an array are also placed contiguously in memory. The k members of two 3D points, points[0] and points[1], are therefore 12 bytes apart in memory. In fact, each time the CPU accesses a point in the k dimension, it jumps to an address 12 bytes away from the previous access.
We would now like to discuss an aspect of the memory systems in modern CPUs. When designing the CPU in the first assignment, we asked you to connect the CPU directly to memory (or main memory). Unfortunately, this aspect of the QuAC CPU is a simplification. The main memory in modern computers is built using Dynamic Random Access Memory (DRAM) technology. Each DRAM cell uses a capacitor to store charge. Storing and retrieving charge to/from capacitors is slower than flip-flops or cross-coupled inverter-based Static Random Access Memory (SRAM). Recall the different memory technologies, and in particular, the discussion around cross-coupled inverters and flip-flops from the lectures. SRAM is expensive but fast and DRAM is cheap but slow. Therefore, modern computers use a memory hierarchy that consists of a small amount of fast SRAM and a large amount of slow DRAM.
On a memory read operation, the CPU first check the fast SRAM (called a cache) for data. If the data is not found in the cache, only then the CPU sends the request to slow main memory (DRAM). A cache hit means the CPU finds the requested data in the cache. A cache miss means the data is not found in the cache. Since DRAM accesses are slow, the CPU tries to be efficient and brings a lot more information than a word. Typically, the CPU loads 64 bytes from memory. This data from memory is then stored in the SRAM cache. The expectation is that if the CPU needs a word at an address A at a given point in time, then it will likely need the neighboring word at address A+1 soon afterward. Surprisingly, a lot of real-world programs exhibit this so-called spatial locality in program references.
The amount of data loaded from memory in each memory access is not related to the size of the CPU’s registers. Instead, it is the CPU’s cache line size, which is the amount of memory loaded in to each cache entry. For example, early Intel Pentium 3 processors were 32-bit processors (i.e. 32 bit registers and ALU), with 512kB caches using a 64 byte cache line size.
Now, let’s go back to the points array. Now that we have more knowledge about the underlying memory system of a modern computer, we can be smart about how we sum all the points along the k dimension in the 3D space. As a start, we can change the representation of our points in space. Consider the following declaration,
struct pointarray3D {
int i[LEN];
int j[LEN];
int k[LEN];
};
Instead of placing the points in the k dimension 12-bytes apart in the array of structs approach, we have placed all the points along the k dimension in the 3D space next to each other. This second representation above is called the struct of arrays approach. By placing the points next to each other, we expect the CPU to more frequently find the k points in the fast SRAM cache. The result is that the number of times the CPU needs to wait for slow main memory is reduced significantly. This way the program that uses the second approach runs faster than the one using the array of structs approach.
Open the programs src/aos.c and src/soa.c and complete the definitions of the sum function. Test your code for correctness. You can measure the time it takes for each program to finish using the time command. For example, if you type time ./program_name on the terminal, the time command reports the total time the program takes to run or execute. You should only look at the real component of the output.
scanf
Finally, we would like to introduce the scanf function that you will need for your second assignment. The scanf function receives formatted input from the terminal the same way printf prints formatted strings to the standard output. Consider the following C code for receiving the user name and integer identifier from the standard input or keyboard.
#include <stdio.h>
void main () {
int id;
char str1[20];
printf("Enter name: ");
scanf("%19s", str1);
printf("Enter your identifier: ");
scanf("%i", &id);
return;
}
Open the program src/scanf.c and read the code carefully to understand what the program is doing. Compile and run the program to make sure your understand how to use scanf in your C programs.