Lab 6: Sanity Checking Implicit Free List Implementation

Outline#

This week’s lab is more of a drop-in for the previous lab. We will go over implementing the two allocators we introduced in Lab 5 (i.e. mmap-based allocator and the implicit free-list allocator) which you can then use as a starting point for your first assignment.

We have also added an extra guide on using GDB to debug memory bugs in your programs - something you will find invaluable when attempting the first assignment. This is available on this page in the resources section for you to read through. Note that to get the code in your lab pack 2 repository, you will need to upstream pull changes:

Add the original lab pack 2 repository you forked as an additional remote:

git remote add upstream https://gitlab.cecs.anu.edu.au/comp2310/2024/comp2310-2024-lab-pack-2.git

Fetch the changes made upstream:

git fetch upstream

Merge the changes to your fork of the repository:

git merge upstream/main

Introduction#

We assume you have read and even attempted Lab 5 before starting this lab. (We will not introduce the relevant topics again.) We strongly advise you to try Lab 5 before seeing the solutions in this lab. We will provide intermediate patches for each step and a complete solution at the end.

Basic setup#

Create an empty mmapalloc.c file in this repository. We will implement the mmap-based allocator in this file.

An mmap-based allocator#

We first create a mmapalloc.c file in the repository, containing all the necessary definitions for our implementation:

#include "mymalloc.h"

const size_t kMaxAllocationSize = 0;

void *my_malloc(size_t size) {
  return NULL;
}

void my_free(void *ptr) {
}

Allocating memory#

For this mmap allocator, if my_malloc is called to allocate \(N\) bytes, we directly call mmap to allocate a chunk with size no less than \(N\) bytes. Each mmap-ed chunk has a piece of meta-data memory for us to record the size of the chunk. So later when we call munmap, we will always know the size of the mmap-ed chunk. We want to have this meta-data as otherwise we may not know what the size of the allocation was.

We first define the Chunk data structure, and set an appropriate maximum allocation size we’re going to support, say, 16 MB. The Chunk data structure simply keeps track of the size of the allocation it points to.

--- a/mmapalloc.c
+++ b/mmapalloc.c
@@ -1,6 +1,14 @@
 #include "mymalloc.h"

-const size_t kMaxAllocationSize = 0;
+// Chunk data structure
+typedef struct {
+  size_t size;
+} Chunk;
+
+// Size of meta-data per Chunk
+const size_t kMetadataSize = sizeof(Chunk);
+// Maximum allocation size (16 MB)
+const size_t kMaxAllocationSize = (16ull << 20) - kMetadataSize;

 void *my_malloc(size_t size) {
   return NULL;

We now implement the my_malloc function by calling mmap. Since we will be storing the size of the Chunk in the first word of the allocation, we want to return the address of the data portion of our allocated memory. In order to facilitate this, we implement a simple helper function that will convert a Chunk to a void * to our data:

--- a/mmapalloc.c
+++ b/mmapalloc.c
@@ -1,3 +1,5 @@
+#include <sys/mman.h>
+
 #include "mymalloc.h"

 // Chunk data structure
@@ -10,8 +12,20 @@ const size_t kMetadataSize = sizeof(Chunk);
 // Maximum allocation size (16 MB)
 const size_t kMaxAllocationSize = (16ull << 20) - kMetadataSize;

+inline static void *get_data(Chunk *chunk) {
+  // We convert the Chunk address to a size_t and get the data address by
+  // adding the meta-data size
+  return (void *)(((size_t)chunk) + kMetadataSize);
+}
+
 void *my_malloc(size_t size) {
-  return NULL;
+  // Request memory from OS
+  Chunk *chunk = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+  // Record allocation size
+  chunk->size = size;
+  // Return data pointer
+  LOG("alloc %p size=%zu\n", get_data(chunk), size);
+  return get_data(chunk);
 }

 void my_free(void *ptr) {

We also want to make sure that mmap hasn’t failed. If it has then we want to return NULL. From mmap’s man page, we can see that mmap returns a value of MAP_FAILED when mmap fails. Hence we can check for errors like so:

--- a/mmapalloc.c
+++ b/mmapalloc.c
@@ -21,6 +21,9 @@ inline static void *get_data(Chunk *chunk) {
 void *my_malloc(size_t size) {
   // Request memory from OS
   Chunk *chunk = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+  if (chunk == MAP_FAILED) {
+    return NULL;
+  }
   // Record allocation size
   chunk->size = size;
   // Return data pointer

Metadata reservation and mmap size#

Note that our my_malloc implementation is not fully correct yet. We still have to fix two major issues, namely:

The chunk meta-data requires extra space in addition to the allocation size.
The size passed to mmap must be a multiple of page size.

To solve these issues, we first add the allocation size to the chunk meta-data size, and then round it up to a multiple of page size. On most operating systems such as Windows and Linux the page size is 4 KB, but if you are using an ARM Mac, then the page size is 16 KB. Here we assume you are using Linux so the page size is 4 KB, i.e. 4096 bytes:

--- a/mmapalloc.c
+++ b/mmapalloc.c
@@ -11,6 +11,13 @@ typedef struct {
 const size_t kMetadataSize = sizeof(Chunk);
 // Maximum allocation size (16 MB)
 const size_t kMaxAllocationSize = (16ull << 20) - kMetadataSize;
+// Size of a page (4 KB)
+static const size_t kPageSize = 1ull << 12;
+
+inline static size_t round_up(size_t size, size_t alignment) {
+  const size_t mask = alignment - 1;
+  return (size + mask) & ~mask;
+}

 inline static void *get_data(Chunk *chunk) {
   // We convert the Chunk address to a size_t and get the data address by
@@ -19,6 +26,8 @@ inline static void *get_data(Chunk *chunk) {
 }

 void *my_malloc(size_t size) {
+  // Round up size
+  size = round_up(size + kMetadataSize, kPageSize);
   // Request memory from OS
   Chunk *chunk = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
   if (chunk == MAP_FAILED) {

Deallocating memory#

Now that our my_malloc function is (more-or-less) complete, let us now tackle implementing the my_free function. Remember how we implemented my_malloc by converting a Chunk to its respective data pointer. For my_free we will perform the reverse operation, i.e. convert a data pointer to its Chunk. We create a helper function to facilitate this conversion as well:

--- a/mmapalloc.c
+++ b/mmapalloc.c
@@ -25,6 +25,12 @@ inline static void *get_data(Chunk *chunk) {
   return (void *)(((size_t)chunk) + kMetadataSize);
 }

+inline static Chunk *get_chunk(void *ptr) {
+  // We convert the data address to a size_t and get the Chunk address by
+  // subtracting the meta-data size
+  return (Chunk *)(((size_t)ptr) - kMetadataSize);
+}
+
 void *my_malloc(size_t size) {
   // Round up size
   size = round_up(size + kMetadataSize, kPageSize);
@@ -41,4 +47,9 @@ void *my_malloc(size_t size) {
 }

 void my_free(void *ptr) {
+  LOG("free %p\n", ptr);
+  // Get chunk
+  Chunk *chunk = get_chunk(ptr);
+  // Unmap memory
+  munmap(chunk, chunk->size);
 }

As with mmap, munmap can fail. We need to check the result of munmap and abort the execution of the program if we have encountered an error. Generally if we are writing programs, we want to gracefully handle errors as it allows the program to redo or retry certain actions, however, memory allocator libraries are different. Since such a library is so low-level, any error in the allocator will almost certainly result in catastrophic failure of the program using the allocator. Hence, it is best that if we find an error state in our allocator, we inform the program of the error and then abort.

--- a/mmapalloc.c
+++ b/mmapalloc.c
@@ -1,3 +1,7 @@
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
 #include <sys/mman.h>

 #include "mymalloc.h"
@@ -51,5 +55,9 @@ void my_free(void *ptr) {
   // Get chunk
   Chunk *chunk = get_chunk(ptr);
   // Unmap memory
-  munmap(chunk, chunk->size);
+  int retval = munmap(chunk, chunk->size);
+  if (retval != 0) {
+    fprintf(stderr, "my_free: %s\n", strerror(errno));
+    abort();
+  }
 }

Corner cases#

To be comprehensive, we need to resolve a few corner cases:

For allocations with zero or greater than the maximum allocation size, my_malloc should return NULL. In such cases, the user of the my_malloc library must check they aren’t using a NULL pointer by accident.
Freeing a NULL pointer is a no-op.

--- a/mmapalloc.c
+++ b/mmapalloc.c
@@ -36,6 +36,7 @@ inline static Chunk *get_chunk(void *ptr) {
 }

 void *my_malloc(size_t size) {
+  if (size == 0 || size > kMaxAllocationSize) return NULL;
   // Round up size
   size = round_up(size + kMetadataSize, kPageSize);
   // Request memory from OS
@@ -52,6 +53,7 @@ void *my_malloc(size_t size) {

 void my_free(void *ptr) {
   LOG("free %p\n", ptr);
+  if (ptr == NULL) return;
   // Get chunk
   Chunk *chunk = get_chunk(ptr);
   // Unmap memory

All together#

The finished source code should look something like this:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#include "mymalloc.h"

// Chunk data structure
typedef struct {
  size_t size;
} Chunk;

// Size of meta-data per Chunk
const size_t kMetadataSize = sizeof(Chunk);
// Maximum allocation size (16 MB)
const size_t kMaxAllocationSize = (16ull << 20) - kMetadataSize;
// Size of a page (4 KB)
static const size_t kPageSize = 1ull << 12;

inline static size_t round_up(size_t size, size_t alignment) {
  const size_t mask = alignment - 1;
  return (size + mask) & ~mask;
}

inline static void *get_data(Chunk *chunk) {
  // We convert the Chunk address to a size_t and get the data address by
  // adding the meta-data size
  return (void *)(((size_t)chunk) + kMetadataSize);
}

inline static Chunk *get_chunk(void *ptr) {
  // We convert the data address to a size_t and get the Chunk address by
  // subtracting the meta-data size
  return (Chunk *)(((size_t)ptr) - kMetadataSize);
}

void *my_malloc(size_t size) {
  if (size == 0 || size > kMaxAllocationSize) return NULL;
  // Round up size
  size = round_up(size + kMetadataSize, kPageSize);
  // Request memory from OS
  Chunk *chunk = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
  if (chunk == MAP_FAILED) {
    return NULL;
  }
  // Record allocation size
  chunk->size = size;
  // Return data pointer
  LOG("alloc %p size=%zu\n", get_data(chunk), size);
  return get_data(chunk);
}

void my_free(void *ptr) {
  LOG("free %p\n", ptr);
  if (ptr == NULL) return;
  // Get chunk
  Chunk *chunk = get_chunk(ptr);
  // Unmap memory
  int retval = munmap(chunk, chunk->size);
  if (retval != 0) {
    fprintf(stderr, "my_free: %s\n", strerror(errno));
    abort();
  }
}

Running tests#

After finishing the implementation of my_malloc and my_free, you need to use the testing script to run tests. Testing is required to check the correctness of your implementation. Since we named our file mmapalloc.c, we run the tests like so:

$ ./test.py -m mmapalloc

Other usages:

Run a single test: ./test.py -t align.
Run tests with release build: ./test.py --release.
Enable logging: ./test.py --log.
- Use LOG("Hello %s\n", "world"); for logging in your source code.
If your source file is called different_malloc.c instead of mymalloc.c: ./test.py -m different_malloc.

Simple free-list allocator#

We now implement a simple implicit free-list allocator.

Create an empty implementation file again and name it implicit_fl.c. Copy the empty implementation from the mmap allocator into this file.

Acquire initial memory#

To reduce the number of expensive system calls, we can get a large chunk of memory in one go and manage it on our own. Let us say we will acquire 64 MB of memory once from the operating system. We can then service allocation requests from the 64 MB chunk, and this chunk size is enough for most of our current tests.

Before we can acquire memory, we must think about how to represent a memory block. Note that a “block” here is just a name for the allocation primitive we will work with. That is to say, when we are asked to allocate an object in my_malloc, we will allocate a “block” that contains meta-data followed by the actual allocation. We define a Block to have two pieces of meta-data: (i) whether the Block is currently allocated or not; and (ii) what is the allocation size of this Block (including its meta-data size):

--- a/implicit_fl.c
+++ b/implicit_fl.c
@@ -1,6 +1,20 @@
+#include <stdbool.h>
+
 #include "mymalloc.h"

-const size_t kMaxAllocationSize = 0;
+typedef struct Block {
+  // Is the block allocated or not?
+  bool allocated;
+  // Size of the block (including meta-data size)
+  size_t size;
+} Block;
+
+// Size of meta-data per Block
+const size_t kBlockMetadataSize = sizeof(Block);
+// Maximum allocation size (16 MB)
+const size_t kMaxAllocationSize = (16ull << 20) - kBlockMetadataSize;
+// Memory size that is mmapped (64 MB)
+const size_t kMemorySize = (16ull << 22);

 void *my_malloc(size_t size) {
   return NULL;

We can now acquire memory from the OS using mmap if we are allocating for the first time:

--- a/implicit_fl.c
+++ b/implicit_fl.c
@@ -1,4 +1,5 @@
 #include <stdbool.h>
+#include <sys/mman.h>

 #include "mymalloc.h"

@@ -16,7 +17,25 @@ const size_t kMaxAllocationSize = (16ull << 20) - kBlockMetadataSize;
 // Memory size that is mmapped (64 MB)
 const size_t kMemorySize = (16ull << 22);

+// Starting address of our heap
+static Block *start = NULL;
+
+/// Acquire more memory from OS
+static Block *acquire_memory() {
+  // Acquire one more chunk from OS
+  Block *block = mmap(NULL, kMemorySize, PROT_READ | PROT_WRITE,
+                      MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+  // Initialize block metadata
+  block->allocated = false;
+  block->size = kMemorySize;
+  return block;
+}
+
 void *my_malloc(size_t size) {
+  // Initial allocation?
+  if (start == NULL) {
+    start = acquire_memory();
+  }
   return NULL;
 }

Allocate memory from the global free list#

We use a global implicit free list to manage the memory blocks. Instead of adding explicit next/prev pointers to block meta-data, we use block start address and block size to traverse blocks inside a contiguous piece of memory and treat this as an implicit free list.

The only block meta-data we need to maintain is the free/allocated flag and the block size. For allocation, we traverse the list, find the first free block large enough, and split it into two blocks if necessary.

We fix the traversal order by always starting from the first block (i.e., the lowest address) and moving towards the last block (i.e., the highest address). To implement this traversal, we need to implement a function that will return the next block from a given block:

--- a/implicit_fl.c
+++ b/implicit_fl.c
@@ -20,6 +20,15 @@ const size_t kMemorySize = (16ull << 22);
 // Starting address of our heap
 static Block *start = NULL;

+/// Get right neighbour
+inline static Block *get_right_block(Block *block) {
+  size_t ptr = ((size_t)block) + block->size;
+  // Return NULL if we are outside the bounds of our heap
+  if (ptr >= ((size_t)start) + kMemorySize)
+    return NULL;
+  return (Block *)ptr;
+}
+
 /// Acquire more memory from OS
 static Block *acquire_memory() {
   // Acquire one more chunk from OS

Now that we have a function that can traverse through our free list, we need to find an algorithm for choosing an appropriate free block in which to place a newly allocated block. We will implement the first fit algorithm. First fit searches the free list from the beginning and chooses the first free block that fits. The idea is to traverse the free list until we find a Block large enough to fit our allocation. We may end up wasting memory if the chosen block is larger than the requested size. In that case, we may decide to split the block into two parts. We will worry about splitting later. If no Block can be found, we will return NULL. We will use helper functions again to abstract the conversion between a Block and data addresses:

--- a/implicit_fl.c
+++ b/implicit_fl.c
@@ -1,3 +1,4 @@
+#include <string.h>
 #include <stdbool.h>
 #include <sys/mman.h>

@@ -20,6 +21,11 @@ const size_t kMemorySize = (16ull << 22);
 // Starting address of our heap
 static Block *start = NULL;

+/// Get data pointer of a block
+inline static void *block_to_data(Block *block) {
+  return (void *)(((size_t)block) + kBlockMetadataSize);
+}
+
 /// Get right neighbour
 inline static Block *get_right_block(Block *block) {
   size_t ptr = ((size_t)block) + block->size;
@@ -45,7 +51,24 @@ void *my_malloc(size_t size) {
   if (start == NULL) {
     start = acquire_memory();
   }
-  return NULL;
+
+  // Find a block in the freelist
+  Block *block = NULL;
+  for (Block *b = start; b != NULL; b = get_right_block(b)) {
+    if (!b->allocated && b->size - kBlockMetadataSize >= size) {
+      block = b;
+      break;
+    }
+  }
+
+  // Mark block as allocated
+  block->allocated = true;
+  // Set block size
+  block->size = size + kBlockMetadataSize;
+  // Zero memory and return
+  void *data = block_to_data(block);
+  memset(data, 0, size);
+  return data;
 }

 void my_free(void *ptr) {

Before we implement my_free, let us make our my_malloc implementation more space efficient as our current implementation is not very space efficient. We first need to implement a function to split a given Block into two:

--- a/implicit_fl.c
+++ b/implicit_fl.c
@@ -1,3 +1,4 @@
+#include <assert.h>
 #include <string.h>
 #include <stdbool.h>
 #include <sys/mman.h>
@@ -35,6 +36,23 @@ inline static Block *get_right_block(Block *block) {
   return (Block *)ptr;
 }

+/// Split a block into two. Both of the blocks will be set as unallocated.
+/// Return block with size at least as big as `size`
+static Block *split_block(Block *block, size_t size) {
+  // We should only split unallocated blocks
+  assert(!block->allocated);
+  size_t total_size = block->size;
+
+  Block *first = block;
+  first->allocated = false;
+  first->size = total_size - size - kBlockMetadataSize;
+
+  Block *second = get_right_block(first);
+  second->size = total_size - first->size;
+  second->allocated = false;
+  return second;
+}
+
 /// Acquire more memory from OS
 static Block *acquire_memory() {
   // Acquire one more chunk from OS

We now use this split_block function to implement my_malloc by splitting the block we find with our first-fit algorithm if it is too large:

--- a/implicit_fl.c
+++ b/implicit_fl.c
@@ -14,6 +14,8 @@ typedef struct Block {

 // Size of meta-data per Block
 const size_t kBlockMetadataSize = sizeof(Block);
+// Minimum allocation size (1 B)
+const size_t kMinAllocationSize = 1;
 // Maximum allocation size (16 MB)
 const size_t kMaxAllocationSize = (16ull << 20) - kBlockMetadataSize;
 // Memory size that is mmapped (64 MB)
@@ -79,10 +81,18 @@ void *my_malloc(size_t size) {
     }
   }

-  // Mark block as allocated
+  // Split block if the block is too large. Our splitting heuristic will split
+  // a block if its size >= requested size + 2 * meta-data size + minimum allocation size
+  if (block->size >= size + (kBlockMetadataSize << 1) + kMinAllocationSize) {
+    Block *second = split_block(block, size);
+    Block *first = block;
+    first->allocated = false;
+    block = second;
+  }
+
+  // Mark block as allocated; We don't have to set the size of the block
+  // anymore as the `split_block` function will set the size
   block->allocated = true;
-  // Set block size
-  block->size = size + kBlockMetadataSize;
   // Zero memory and return
   void *data = block_to_data(block);
   memset(data, 0, size);

There is a subtle bug in our my_malloc implementation: if we can’t find a free block to allocate into, we will get a SIGSEGV as we will dereference a NULL pointer! If we can’t find a free block, then that means that we have exhausted all our memory. Let’s fix this bug by setting the errno to out of memory and returning NULL to the program:

--- a/implicit_fl.c
+++ b/implicit_fl.c
@@ -1,3 +1,4 @@
+#include <errno.h>
 #include <assert.h>
 #include <string.h>
 #include <stdbool.h>
@@ -81,6 +82,12 @@ void *my_malloc(size_t size) {
     }
   }

+  // We failed to find a free block. Return NULL
+  if (block == NULL) {
+    errno = ENOMEM;
+    return NULL;
+  }
+
   // Split block if the block is too large. Our splitting heuristic will split
   // a block if its size >= requested size + 2 * meta-data size + minimum allocation size
   if (block->size >= size + (kBlockMetadataSize << 1) + kMinAllocationSize) {

Release memory to the global free list#

To release a memory block, we get the block start address from the given pointer, and mark the block as free:

--- a/implicit_fl.c
+++ b/implicit_fl.c
@@ -30,6 +30,11 @@ inline static void *block_to_data(Block *block) {
   return (void *)(((size_t)block) + kBlockMetadataSize);
 }

+/// Get the block of a data pointer
+inline static Block *data_to_block(void *ptr) {
+  return (Block *)(((size_t)ptr) - kBlockMetadataSize);
+}
+
 /// Get right neighbour
 inline static Block *get_right_block(Block *block) {
   size_t ptr = ((size_t)block) + block->size;
@@ -107,4 +112,8 @@ void *my_malloc(size_t size) {
 }

 void my_free(void *ptr) {
+  // Get block pointer
+  Block *block = data_to_block(ptr);
+  // Mark block as free
+  block->allocated = false;
 }

Alignment#

If you run all the tests now using ./test.py, you’ll see one important test (align) failing. We see that it failed with a message “align: tests/align.c:12: int main(): Assertion ‘is_word_aligned(ptr)’ failed.”. Checking tests/align.c line 12, we can see that the first allocation void *ptr = mallocing(1); may not return a word-aligned address. The issue lies in the fact that we are not aligning our size to the next word. We can fix this by rounding up our allocation size correctly (note how we had to change the minimum allocation size as well):

--- a/implicit_fl.c
+++ b/implicit_fl.c
@@ -13,10 +13,12 @@ typedef struct Block {
   size_t size;
 } Block;

+// Word alignment
+const size_t kAlignment = sizeof(size_t);
 // Size of meta-data per Block
 const size_t kBlockMetadataSize = sizeof(Block);
-// Minimum allocation size (1 B)
-const size_t kMinAllocationSize = 1;
+// Minimum allocation size (1 word)
+const size_t kMinAllocationSize = kAlignment;
 // Maximum allocation size (16 MB)
 const size_t kMaxAllocationSize = (16ull << 20) - kBlockMetadataSize;
 // Memory size that is mmapped (64 MB)
@@ -25,6 +27,12 @@ const size_t kMemorySize = (16ull << 22);
 // Starting address of our heap
 static Block *start = NULL;

+/// Round up size
+inline static size_t round_up(size_t size, size_t alignment) {
+  const size_t mask = alignment - 1;
+  return (size + mask) & ~mask;
+}
+
 /// Get data pointer of a block
 inline static void *block_to_data(Block *block) {
   return (void *)(((size_t)block) + kBlockMetadataSize);
@@ -73,6 +81,8 @@ static Block *acquire_memory() {
 }

 void *my_malloc(size_t size) {
+  // Round up allocation size
+  size = round_up(size + kBlockMetadataSize, kAlignment);
   // Initial allocation?
   if (start == NULL) {
     start = acquire_memory();
@@ -81,7 +91,7 @@ void *my_malloc(size_t size) {
   // Find a block in the freelist
   Block *block = NULL;
   for (Block *b = start; b != NULL; b = get_right_block(b)) {
-    if (!b->allocated && b->size - kBlockMetadataSize >= size) {
+    if (!b->allocated && b->size >= size) {
       block = b;
       break;
     }

Handling corner cases (again)#

We now have a mostly correct implementation of an implicit free-list allocator. We just have to handle a few corner cases again just like for the mmap allocator:

--- a/implicit_fl.c
+++ b/implicit_fl.c
@@ -81,6 +81,7 @@ static Block *acquire_memory() {
 }

 void *my_malloc(size_t size) {
+  if (size == 0 || size > kMaxAllocationSize) return NULL;
   // Round up allocation size
   size = round_up(size + kBlockMetadataSize, kAlignment);
   // Initial allocation?
@@ -122,6 +123,7 @@ void *my_malloc(size_t size) {
 }

 void my_free(void *ptr) {
+  if (ptr == NULL) return;
   // Get block pointer
   Block *block = data_to_block(ptr);
   // Mark block as free

Consider the case where we provide a random pointer to our my_free implementation. What will happen in our current implementation? Currently it might crash after converting the data pointer to a Block pointer. Hence, we need to be careful while dealing with random data addresses. We can fix this bug by checking if the Block is within the range of our heap:

--- a/implicit_fl.c
+++ b/implicit_fl.c
@@ -1,5 +1,7 @@
 #include <errno.h>
+#include <stdio.h>
 #include <assert.h>
+#include <stdlib.h>
 #include <string.h>
 #include <stdbool.h>
 #include <sys/mman.h>
@@ -52,6 +54,23 @@ inline static Block *get_right_block(Block *block) {
   return (Block *)ptr;
 }

+/// Check if Block pointed by `block` is in our mmaped memory
+static bool in_mmaped_memory(Block *block) {
+  size_t block_sz = (size_t) block;
+  size_t start_sz = (size_t) start;
+  size_t end_sz = start_sz + kMemorySize;
+  if (start == NULL)
+    // if we haven't mmaped anything then it is not in our memory
+    return false;
+  if (block_sz < start_sz)
+    // if the block is before our start then it is not in our memory
+    return false;
+  if (block_sz > end_sz)
+    // if the block is after our end then it is not in our memory
+    return false;
+  return true;
+}
+
 /// Split a block into two. Both of the blocks will be set as unallocated.
 /// Return block with size at least as big as `size`
 static Block *split_block(Block *block, size_t size) {
@@ -126,6 +145,11 @@ void my_free(void *ptr) {
   if (ptr == NULL) return;
   // Get block pointer
   Block *block = data_to_block(ptr);
+  if (!in_mmaped_memory(block) || !block->allocated) {
+    errno = EINVAL;
+    fprintf(stderr, "my_free: %s\n", strerror(errno));
+    abort();
+  }
   // Mark block as free
   block->allocated = false;
 }

All together#

The finished source code should look something like this:

#include <errno.h>
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <sys/mman.h>

#include "mymalloc.h"

typedef struct Block {
  // Is the block allocated or not?
  bool allocated;
  // Size of the block (including meta-data size)
  size_t size;
} Block;

// Word alignment
const size_t kAlignment = sizeof(size_t);
// Size of meta-data per Block
const size_t kBlockMetadataSize = sizeof(Block);
// Minimum allocation size (1 word)
const size_t kMinAllocationSize = kAlignment;
// Maximum allocation size (16 MB)
const size_t kMaxAllocationSize = (16ull << 20) - kBlockMetadataSize;
// Memory size that is mmapped (64 MB)
const size_t kMemorySize = (16ull << 22);

// Starting address of our heap
static Block *start = NULL;

/// Round up size
inline static size_t round_up(size_t size, size_t alignment) {
  const size_t mask = alignment - 1;
  return (size + mask) & ~mask;
}

/// Get data pointer of a block
inline static void *block_to_data(Block *block) {
  return (void *)(((size_t)block) + kBlockMetadataSize);
}

/// Get the block of a data pointer
inline static Block *data_to_block(void *ptr) {
  return (Block *)(((size_t)ptr) - kBlockMetadataSize);
}

/// Get right neighbour
inline static Block *get_right_block(Block *block) {
  size_t ptr = ((size_t)block) + block->size;
  // Return NULL if we are outside the bounds of our heap
  if (ptr >= ((size_t)start) + kMemorySize)
    return NULL;
  return (Block *)ptr;
}

/// Check if Block pointed by `block` is in our mmaped memory
static bool in_mmaped_memory(Block *block) {
  size_t block_sz = (size_t) block;
  size_t start_sz = (size_t) start;
  size_t end_sz = start_sz + kMemorySize;
  if (start == NULL)
    // if we haven't mmaped anything then it is not in our memory
    return false;
  if (block_sz < start_sz)
    // if the block is before our start then it is not in our memory
    return false;
  if (block_sz > end_sz)
    // if the block is after our end then it is not in our memory
    return false;
  return true;
}

/// Split a block into two. Both of the blocks will be set as unallocated.
/// Return block with size at least as big as `size`
static Block *split_block(Block *block, size_t size) {
  // We should only split unallocated blocks
  assert(!block->allocated);
  size_t total_size = block->size;

  Block *first = block;
  first->allocated = false;
  first->size = total_size - size - kBlockMetadataSize;

  Block *second = get_right_block(first);
  second->size = total_size - first->size;
  second->allocated = false;
  return second;
}

/// Acquire more memory from OS
static Block *acquire_memory() {
  // Acquire one more chunk from OS
  Block *block = mmap(NULL, kMemorySize, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
  // Initialize block metadata
  block->allocated = false;
  block->size = kMemorySize;
  return block;
}

void *my_malloc(size_t size) {
  if (size == 0 || size > kMaxAllocationSize) return NULL;
  // Round up allocation size
  size = round_up(size + kBlockMetadataSize, kAlignment);
  // Initial allocation?
  if (start == NULL) {
    start = acquire_memory();
  }

  // Find a block in the freelist
  Block *block = NULL;
  for (Block *b = start; b != NULL; b = get_right_block(b)) {
    if (!b->allocated && b->size >= size) {
      block = b;
      break;
    }
  }

  // We failed to find a free block. Return NULL
  if (block == NULL) {
    errno = ENOMEM;
    return NULL;
  }

  // Split block if the block is too large. Our splitting heuristic will split
  // a block if its size >= requested size + 2 * meta-data size + minimum allocation size
  if (block->size >= size + (kBlockMetadataSize << 1) + kMinAllocationSize) {
    Block *second = split_block(block, size);
    Block *first = block;
    first->allocated = false;
    block = second;
  }

  // Mark block as allocated; We don't have to set the size of the block
  // anymore as the `split_block` function will set the size
  block->allocated = true;
  // Zero memory and return
  void *data = block_to_data(block);
  memset(data, 0, size);
  return data;
}

void my_free(void *ptr) {
  if (ptr == NULL) return;
  // Get block pointer
  Block *block = data_to_block(ptr);
  if (!in_mmaped_memory(block) || !block->allocated) {
    errno = EINVAL;
    fprintf(stderr, "my_free: %s\n", strerror(errno));
    abort();
  }
  // Mark block as free
  block->allocated = false;
}

Running tests#

After finishing the my_malloc and my_free implementations, rerun the tests to see if you are passing them or not. Note that all tests passing does not necessarily mean that your implementation is correct! Try making some small tests that use your my_malloc and my_free functions to allocate memory dynamically. Try to think of edge cases and other conditions that may break the assumptions of your implementation.