tags : Linux, syscalls, O

Use cases

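  • Allocate anonymous memory (malloc implementations such as glibc's use it internally for large allocations)
  • Read and write files in a way that can be more efficient than read() and write()
  • Map a resource (e.g. a portion of a file) into the virtual address space (VAS) of a process.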
  • Multiple processes can each map the same resource into their memory space
  • IPC (MAP_SHARED | MAP_ANONYMOUS)
  • Finer control over permissions (PROT_READ, PROT_WRITE, PROT_EXEC)
  • Eliminates the protection domain crossing from system calls and the kernel/userspace data copies that I/O system calls such as read() imply. ( See O )
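
A minimal sketch of the "map a portion of a file" use case, using Python's standard mmap module on Linux. The helper name map_file_window and the temporary file are illustrative assumptions, not part of any real codebase. Once the window is mapped, reads and writes are plain slice operations rather than read()/write() calls.

```python
import mmap
import os
import tempfile


def map_file_window(path, offset, length):
    """Map `length` bytes of `path` starting at `offset` (shared, read/write)."""
    # mmap requires the file offset to be a multiple of the allocation granularity.
    assert offset % mmap.ALLOCATIONGRANULARITY == 0
    fd = os.open(path, os.O_RDWR)
    try:
        # Python's mmap duplicates the descriptor, so closing ours is safe.
        return mmap.mmap(fd, length, flags=mmap.MAP_SHARED,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE, offset=offset)
    finally:
        os.close(fd)


def demo():
    gran = mmap.ALLOCATIONGRANULARITY
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"A" * gran + b"B" * gran)
        path = f.name
    try:
        window = map_file_window(path, offset=gran, length=gran)
        first = window[:4]        # plain slice, no read() call
        window[:4] = b"mmap"      # plain store, no write() call
        window.flush()            # msync: push the dirty page to the file
        window.close()
        with open(path, "rb") as f:
            f.seek(gran)
            on_disk = f.read(4)
        return first, on_disk
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

Only the second half of the file is mapped here; the first half never enters this process's address space through the mapping.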

Issues with using it in databases

Transactional Safety

  • The OS can flush dirty pages at any time (even if the writing transaction hasn’t committed!)
  • Best-case scenario: the workload is read-only, so there are no dirty pages to flush
  • Solutions: OS copy-on-write (CoW), shadow paging
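
A rough sketch of the OS copy-on-write idea, assuming Linux and Python's mmap module. With MAP_PRIVATE, stores land in private copies of the pages, so the OS never writes uncommitted changes to the file. The "commit" step below (a plain pwrite plus fsync) is a simplified stand-in chosen for illustration; a real DBMS would typically log the change first.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def demo_private_mapping():
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\x00" * PAGE)
        path = f.name
    fd = os.open(path, os.O_RDWR)
    try:
        # MAP_PRIVATE: writes go to private copy-on-write pages, not to the file.
        m = mmap.mmap(fd, PAGE, flags=mmap.MAP_PRIVATE,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        m[:5] = b"dirty"                      # uncommitted change
        before_commit = os.pread(fd, 5, 0)    # file still holds the old bytes
        # Hypothetical commit: the application decides when to persist.
        os.pwrite(fd, m[:PAGE], 0)
        os.fsync(fd)
        after_commit = os.pread(fd, 5, 0)
        m.close()
        return before_commit, after_commit
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo_private_mapping())
```

The point of the design is that persistence becomes an explicit application step instead of something the OS may do at an arbitrary moment.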

I/O Stalls

  • Because the OS manages your file I/O under mmap, you do not know ahead of an access whether your page is in memory or on disk
  • Reading from disk can cause I/O stalls

Error Handling

  • We want to validate checksum of a page when it’s loaded to memory
  • We want to check for corruption before writing things back to disk
  • We cannot do either, because we don’t have access at that level when I/O is handled by the OS
  • Accessing mmap'ed data can raise SIGBUS
  • Consequence: instead of being confined to the buffer pool module, this error handling ends up spread across the rest of your codebase
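
For contrast, a sketch of what a buffer pool that does its own I/O can do: verify a checksum every time a page is loaded. The page layout (a 4-byte CRC32 followed by the payload), the 4096-byte page size, and all function names are assumptions made for this example.

```python
import os
import struct
import tempfile
import zlib

PAGE_SIZE = 4096                 # assumed page size
HEADER = struct.Struct("<I")     # assumed layout: 4-byte CRC32, then payload


class PageCorruption(Exception):
    pass


def write_page(fd, page_no, payload):
    assert len(payload) == PAGE_SIZE - HEADER.size
    page = HEADER.pack(zlib.crc32(payload)) + payload
    os.pwrite(fd, page, page_no * PAGE_SIZE)


def read_page(fd, page_no):
    # Explicit I/O gives one place to check every page as it enters memory.
    page = os.pread(fd, PAGE_SIZE, page_no * PAGE_SIZE)
    if len(page) != PAGE_SIZE:
        raise PageCorruption(f"short read on page {page_no}")
    (stored,) = HEADER.unpack_from(page)
    payload = page[HEADER.size:]
    if zlib.crc32(payload) != stored:
        raise PageCorruption(f"checksum mismatch on page {page_no}")
    return payload


def demo():
    fd, path = tempfile.mkstemp()
    try:
        payload = b"x" * (PAGE_SIZE - HEADER.size)
        write_page(fd, 0, payload)
        ok = read_page(fd, 0) == payload
        os.pwrite(fd, b"!", HEADER.size + 10)   # simulate on-disk corruption
        try:
            read_page(fd, 0)
            detected = False
        except PageCorruption:
            detected = True
        return ok, detected
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

With mmap there is no equivalent of read_page: the kernel faults the page in on first touch, so the check would have to be repeated wherever the data is accessed.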

Performance Issues

  • TODO : Add details?

Random notes

Precautionary Tales

  • Use memory maps if you access data randomly (sparse reads).
  • Read files normally (read()) if you access data sequentially.
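
A small sketch of the two access patterns, assuming a file of fixed-size 8-byte records (the record layout and function names are made up for illustration). Sequential scans use ordinary buffered reads; sparse lookups index straight into a read-only mapping. It shows the shape of each approach and makes no performance claim.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<Q")   # assumed record layout: one 8-byte integer


def sum_sequential(path, chunk_records=1024):
    """Scan the whole file front to back with plain read() calls."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(RECORD.size * chunk_records):
            total += sum(v for (v,) in RECORD.iter_unpack(chunk))
    return total


def lookup_random(path, indices):
    """Fetch a few scattered records through a read-only mapping."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        return [RECORD.unpack_from(m, i * RECORD.size)[0] for i in indices]


def demo():
    with tempfile.NamedTemporaryFile(delete=False) as f:
        for i in range(10_000):
            f.write(RECORD.pack(i))
        path = f.name
    try:
        return sum_sequential(path), lookup_random(path, [9_999, 0, 4_242])
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```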

Interesting flags

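  • MAP_SHARED (share with other processes; updates are carried through to the underlying file)
  • MAP_PRIVATE (private copy-on-write mapping; updates are not visible to other processes and not written to the file)
  • MAP_ANONYMOUS (not backed by any file; contents are initialized to zero)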
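
A minimal sketch of IPC through MAP_SHARED | MAP_ANONYMOUS, assuming Linux (it relies on os.fork). The parent creates the mapping before forking, so parent and child share the same physical page; the function name and the value 42 are arbitrary.

```python
import mmap
import os
import struct


def demo_shared_anonymous():
    # fileno=-1 requests an anonymous mapping; MAP_SHARED keeps the pages
    # shared with any child created by fork().
    shared = mmap.mmap(-1, mmap.PAGESIZE,
                       flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE)
    initial = struct.unpack_from("<I", shared, 0)[0]   # anonymous memory starts zeroed
    pid = os.fork()
    if pid == 0:
        # Child: write into the shared page, then exit immediately.
        struct.pack_into("<I", shared, 0, 42)
        os._exit(0)
    os.waitpid(pid, 0)                                  # wait for the child's write
    value = struct.unpack_from("<I", shared, 0)[0]
    shared.close()
    return initial, value


if __name__ == "__main__":
    print(demo_shared_anonymous())
```

Had the mapping been created with MAP_PRIVATE instead, the child's store would have gone to its own copy-on-write page and the parent would still read 0.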

More resources