tags : Algorithms, Systems, Filesystems

Copy on Write

  • It’s essentially a programming technique which is applicable in many contexts.
  • Filesystems
    • CoW as an alternative to journaling (btrfs uses CoW, whereas ext4 uses journaling)
    • You work on a copy. After you are done, the copy is then made the real file.
  • Memory Management
    • When a new process is created, its page table entries (PTEs) just point to the existing resource. Only when a page is written to does the kernel copy it to a new physical address. See mmap.
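
As a rough userland sketch of the two bullets above, assuming a POSIX system and a non-empty file: `replace_via_copy` writes a temporary copy and renames it over the original, which is only an analogy for what a CoW filesystem does internally, and `private_mapping_write` writes through a private (`mmap.ACCESS_COPY`) mapping, where pages are copied only when first written. Both function names are made up for this example.

```python
import mmap
import os
import tempfile


def replace_via_copy(path, new_data):
    """Filesystem-style analogy: work on a copy, then make it the real file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the copy in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Atomic on POSIX: readers see either the old file or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise


def private_mapping_write(path, patch):
    """Memory-style CoW: write to a private mapping of the file.

    Returns (bytes seen through the mapping, bytes still in the file).
    """
    with open(path, "rb") as f:
        # ACCESS_COPY maps the file privately: writes go to copied pages
        # in memory and are never written back to the file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
            m[: len(patch)] = patch
            seen = bytes(m[:])
    with open(path, "rb") as f:
        on_disk = f.read()
    return seen, on_disk
```

Calling `private_mapping_write(path, b"HELLO")` on a file containing `hello world` should show the patched bytes through the mapping while the file itself stays unchanged, which is the observable effect of the copy being made on write.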

Zero Copy

  • This is different from CoW, but I decided to put them on the same page anyway.

The case of sendfile

  • Optimizing Large File Transfers in Linux with Go
  • Sendfile (a system call for web developers to know about!)
  • The Linux kernel supports a few zero-copy syscalls, e.g. sendfile (implemented in the kernel on top of the splice machinery).
  • It avoids kernel-userspace-kernel roundtrip.
  • sendfile is synchronous, but a similar transfer can be done asynchronously with io_uring through its splice support (IORING_OP_SPLICE, Linux 5.7+).
    • With sendfile, the file content is still buffered, but in kernel space (the page cache) rather than in a userspace buffer.
    • sendfile doesn’t use DMA directly, but DMA can happen at either end of the transfer:
      • DEVICE_TO_MEMORY
        • DMA copy from disk drive to fd_in buffer
      • WHAT SENDFILE DOES
        • The fd_in buffer passes page references to the fd_out buffer via the pipe
      • MEMORY_TO_DEVICE
        • The Ethernet card reads the actual data directly from the fd_in buffer by following the references in the fd_out buffer.
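
A minimal sketch of the sendfile path from Python, assuming Linux. `send_file_zero_copy` is a hypothetical helper name; `os.sendfile` is the standard-library wrapper around the syscall. The file bytes never enter a Python buffer: the kernel moves them from the page cache to the socket.

```python
import os
import socket


def send_file_zero_copy(sock, path):
    """Send the whole file at `path` over `sock` with os.sendfile.

    Returns the number of bytes handed to the socket.
    """
    sent = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent < size:
            # Arguments: out_fd, in_fd, offset in the file, max bytes to send.
            # sendfile may send fewer bytes than requested, hence the loop.
            n = os.sendfile(sock.fileno(), f.fileno(), sent, size - sent)
            if n == 0:
                # The file was truncated while we were sending it.
                break
            sent += n
    return sent
```

Compared with a `read()` followed by `sendall()`, this avoids the kernel-userspace-kernel roundtrip noted above. The call still blocks until the data has been queued on the socket.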