Task 1: Quantitative Analysis

In this task, you will quantitatively compare syscall I/O to memory-mapped file I/O for small I/O accesses (up to 64 bytes). Specifically, we ask you to write a simple benchmarking framework that performs a large number of small read/write I/O operations on a file using each of the following techniques. We leave it to you to create the file and write the C code from the ground up.

  • Use native read()/write() system calls to perform a number of I/O operations on a file
  • Use standard C library fread() and fwrite() functions to perform I/O operations
  • Use the robust I/O (RIO) package from the course website to perform I/O operations (use the appropriate read/write functions from csapp.h)
  • Use memory-mapped file I/O (MMIO) to perform I/O operations (use pointer arithmetic where necessary)

Your benchmark code should use #define to create constants for (1) the number of read and write operations and (2) the size of each I/O operation.
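As a starting point, the two constants and the native system call variant might be sketched as follows. The names NUM_OPS, IO_SIZE, and bench_syscall() are illustrative assumptions, not names required by the task, and the values are arbitrary.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define NUM_OPS 100000  /* assumed value: writes performed, then reads performed */
#define IO_SIZE 64      /* assumed value: bytes per I/O operation */

/* Writes NUM_OPS blocks of IO_SIZE bytes with write(), then reads them back
 * with read(). Returns 0 on success, -1 on any error or short transfer. */
int bench_syscall(const char *path)
{
    char buf[IO_SIZE];
    for (int i = 0; i < IO_SIZE; i++)
        buf[i] = (char)('a' + i % 26);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* One system call per small write. */
    for (long i = 0; i < NUM_OPS; i++) {
        if (write(fd, buf, IO_SIZE) != IO_SIZE) {
            close(fd);
            return -1;
        }
    }

    /* Move the file offset back to the start before reading. */
    if (lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }

    /* One system call per small read. */
    for (long i = 0; i < NUM_OPS; i++) {
        if (read(fd, buf, IO_SIZE) != IO_SIZE) {
            close(fd);
            return -1;
        }
    }

    return close(fd);
}
```

Calling bench_syscall("testfile.dat") from main() gives one benchmark binary; changing the two constants and recompiling varies the workload.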

Recall that the standard C library and the robust I/O package maintain an internal buffer to reduce the number of system calls. They copy the requested data (mostly) from the internal buffer to the user-provided buffer. This copy is avoidable when using native read()/write() system calls. In both cases, the OS kernel does its own buffering in the page cache to amortize the high cost of disk transfers.
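The buffered variant can be sketched with the standard C library as shown here. The function name bench_stdio() and the constant values are assumptions for illustration. Most of the small fwrite()/fread() calls are served from the stream's internal buffer, so far fewer system calls reach the kernel than in a loop of raw read()/write() calls.

```c
#include <stdio.h>

#define NUM_OPS 100000  /* assumed value */
#define IO_SIZE 64      /* assumed value */

/* Writes NUM_OPS blocks of IO_SIZE bytes with fwrite(), then reads them back
 * with fread(). Returns 0 on success, -1 on failure. */
int bench_stdio(const char *path)
{
    char buf[IO_SIZE] = {0};

    FILE *fp = fopen(path, "w+");  /* create/truncate, open for update */
    if (fp == NULL) {
        perror("fopen");
        return -1;
    }

    /* Each call copies IO_SIZE bytes into the stream buffer; the library
     * issues a write() only when that buffer fills up. */
    for (long i = 0; i < NUM_OPS; i++) {
        if (fwrite(buf, 1, IO_SIZE, fp) != IO_SIZE) {
            fclose(fp);
            return -1;
        }
    }

    /* On an update stream, a positioning call is required between output
     * and input; fseek() also flushes the pending output. */
    if (fseek(fp, 0L, SEEK_SET) != 0) {
        fclose(fp);
        return -1;
    }

    /* Each call copies IO_SIZE bytes out of the stream buffer; the library
     * refills the buffer with a larger read() when it runs empty. */
    for (long i = 0; i < NUM_OPS; i++) {
        if (fread(buf, 1, IO_SIZE, fp) != IO_SIZE) {
            fclose(fp);
            return -1;
        }
    }

    return fclose(fp) == 0 ? 0 : -1;
}
```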

Recall that mapping a disk file into virtual memory with mmap() returns a pointer that the programmer can use to access the file. The first access to a page of the file (e.g., a read through the pointer) triggers a page fault, and the kernel transfers the entire 4 KB page from disk to main memory. Subsequent accesses to that page cause no transfer, and MMIO involves no copy operations like those in syscall I/O. The program has direct access to the file contents that the kernel transferred into main memory.
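A minimal sketch of the memory-mapped variant follows, assuming a shared read/write mapping of the whole file. The name bench_mmap() and the constant values are illustrative. The file is first extended with ftruncate() because touching mapped pages beyond the end of the file raises SIGBUS.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define NUM_OPS 100000  /* assumed value */
#define IO_SIZE 64      /* assumed value */

/* Performs NUM_OPS writes and NUM_OPS reads of IO_SIZE bytes each through a
 * memory mapping. Returns 0 on success, -1 on failure. */
int bench_mmap(const char *path)
{
    const size_t len = (size_t)NUM_OPS * IO_SIZE;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* The file must be at least as long as the region we access. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    char *base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (base == MAP_FAILED)
        return -1;

    /* "Write" operations: store directly into the mapped pages,
     * using pointer arithmetic to locate each block. */
    for (size_t i = 0; i < NUM_OPS; i++)
        memset(base + i * IO_SIZE, 'x', IO_SIZE);

    /* "Read" operations: load directly from the mapped pages. */
    unsigned long sum = 0;
    for (size_t i = 0; i < NUM_OPS; i++) {
        const unsigned char *p = (const unsigned char *)(base + i * IO_SIZE);
        for (size_t j = 0; j < IO_SIZE; j++)
            sum += p[j];
    }

    if (munmap(base, len) < 0)
        return -1;

    /* Every byte read back should be 'x'. */
    return sum == (unsigned long)NUM_OPS * IO_SIZE * (unsigned long)'x' ? 0 : -1;
}
```

Summing the bytes keeps the reads observable, so the compiler cannot discard the read loop.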

Bottom line: MMIO uses the virtual memory abstraction and relies on page faulting for disk-to-memory transfers. Its advantage is direct (pointer) access to file contents. Syscall I/O, on the other hand, incurs at least one and up to two memory copies in addition to disk-to-memory transfers. How these trade-offs play out in a real system depends on the underlying architecture and the program’s I/O behavior.

You should use the Linux shell’s time command to measure your benchmark’s execution times. You should measure the execution time for each of the above I/O techniques.

Please send an email to shoaib.akram@anu.edu.au with your results and analysis.

Task 2: Understanding functions in the robust I/O package

Read the code for the following functions from the RIO package, and try to understand how each one is implemented.

  • static ssize_t rio_read(rio_t *rp, char *usrbuf, size_t n)
  • ssize_t rio_readnb(rio_t *rp, void *usrbuf, size_t n)
  • ssize_t rio_readlineb(rio_t *rp, void *usrbuf, size_t maxlen)

Task 3: Understanding additional flags for opening files

Use the man pages to understand the meaning and use of the O_DIRECT and O_NONBLOCK flags, which can be passed to open() when opening a file.
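As one small experiment for O_NONBLOCK only, the sketch below uses a FIFO, since the flag has little visible effect on regular disk files. The function name nonblock_demo() is hypothetical. It covers just one behavior described in the man pages: a read on an empty FIFO that has a writer fails immediately with EAGAIN instead of blocking. O_DIRECT is not shown here.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if a read on an empty FIFO opened with O_NONBLOCK fails at once
 * with EAGAIN/EWOULDBLOCK, 0 if it behaves differently, -1 on setup errors. */
int nonblock_demo(const char *fifo_path)
{
    unlink(fifo_path);  /* remove any leftover FIFO from a previous run */
    if (mkfifo(fifo_path, 0600) < 0)
        return -1;

    /* With O_NONBLOCK, opening the read end returns immediately even though
     * no process has the FIFO open for writing yet. */
    int rfd = open(fifo_path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0) {
        unlink(fifo_path);
        return -1;
    }

    /* Open the write end so the FIFO has a writer but contains no data. */
    int wfd = open(fifo_path, O_WRONLY);
    if (wfd < 0) {
        close(rfd);
        unlink(fifo_path);
        return -1;
    }

    char c;
    ssize_t n = read(rfd, &c, 1);  /* would block without O_NONBLOCK */
    int saved_errno = errno;       /* save before close() can change it */

    close(wfd);
    close(rfd);
    unlink(fifo_path);

    if (n < 0 && (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK))
        return 1;
    return 0;
}
```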

Task 4: Understanding fflush() and fsync()

Use Google and man pages to understand the need and use of (1) C library function fflush() and (2) system-level function fsync().
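The two calls operate at different layers, which the following sketch tries to make concrete. The function name durable_write() is a hypothetical helper. fflush() pushes data from the stdio buffer in user space to the kernel, and fsync() asks the kernel to push its cached copy of the file to the storage device.

```c
#include <stdio.h>
#include <unistd.h>

/* Writes msg to path and tries to make it durable before returning.
 * Returns 0 on success, -1 on failure. */
int durable_write(const char *path, const char *msg)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    /* The data lands in the stdio buffer in user space. */
    if (fputs(msg, fp) == EOF) {
        fclose(fp);
        return -1;
    }

    /* fflush(): stdio buffer -> kernel (page cache), via write(). */
    if (fflush(fp) != 0) {
        fclose(fp);
        return -1;
    }

    /* fsync(): kernel page cache -> storage device. It takes a file
     * descriptor, so fileno() extracts it from the stream. */
    if (fsync(fileno(fp)) != 0) {
        fclose(fp);
        return -1;
    }

    return fclose(fp) == 0 ? 0 : -1;
}
```

Calling fsync() without the preceding fflush() would not cover bytes still sitting in the stdio buffer, because the kernel has not seen them yet.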
