tags : Linux, syscalls, O, Memory Allocation
Usecases
- Allocate anonymous memory (malloc uses it internally, e.g. glibc for large allocations)
- Read and write files more efficiently than with read() and write()
- Glue some resource (a portion of a file) to the VAS (virtual address space) of some process.
- Multiple processes can each map the same resource into their memory space
- IPC (MAP_SHARED | MAP_ANONYMOUS)
- More control over permissions
- Eliminates the protection domain crossing from system calls and the kernel/userspace data copies that I/O system calls such as read() imply. ( See O )
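A minimal sketch of the file use case, assuming Linux and Python's standard mmap module. The helper names uppercase_in_place and demo_file_mapping are made up for illustration. The file is edited through plain slice assignment on the mapping instead of read()/write() calls:

```python
import mmap
import os
import tempfile


def uppercase_in_place(path):
    """Upper-case a file's bytes through a shared file mapping."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file. On Unix the default flag is
        # MAP_SHARED, so stores to the mapping reach the file.
        with mmap.mmap(f.fileno(), 0) as m:
            m[:] = m[:].upper()  # slice assignment, no write() call
            m.flush()            # ask the OS to write dirty pages back


def demo_file_mapping():
    """Hypothetical driver: create a temp file, edit it via mmap, read it back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello mmap")
        os.close(fd)
        uppercase_in_place(path)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)
```

Note that m[:] still copies the bytes into a Python object; the point of the sketch is only that the file contents change without an explicit write().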
Issues with using it in Databases
Transactional Safety
- OS can flush dirty pages at any time (even if the writing transaction hasn’t committed!)
- The safe best case is a read-only workload, where there are no dirty pages to flush
- Solution: OS CoW, Shadow paging
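The copy-on-write idea can be illustrated with MAP_PRIVATE: writes land in private copies of the pages, so the OS never flushes them to the file. This is only a sketch of the mechanism, assuming Linux; the function name is hypothetical, and a real database would still need its own path (for example a log) to persist committed changes.

```python
import mmap
import os
import tempfile


def private_write_leaves_file_untouched():
    """Modify a MAP_PRIVATE mapping and show the file on disk is unchanged."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"committed")
        with mmap.mmap(fd, 0, flags=mmap.MAP_PRIVATE,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as m:
            m[0:9] = b"tentative"  # triggers copy-on-write of the page
            in_memory = m[:]       # what this process sees
        on_disk = os.pread(fd, 9, 0)  # what the file still contains
        return in_memory, on_disk
    finally:
        os.close(fd)
        os.unlink(path)
```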
I/O Stalls
- Because the OS manages your file I/O with mmap, you have no idea whether a page is in memory or on disk
- Touching a page that is not in memory means reading it from disk, which can cause I/O stalls
Error Handling
- We want to validate the checksum of a page when it’s loaded into memory
- We want to check for corruption before writing things back to disk
- We cannot do either because we don’t have access at that level when I/O is handled by the OS
- Trying to access mmap data can cause SIGBUS
- Consequence: instead of keeping all this error handling in the buffer pool module, it now has to be spread across the rest of your codebase
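For contrast, here is a rough sketch of what page-level validation looks like when the database does its own I/O. The page layout (a 4-byte CRC32 header followed by the payload), PAGE_SIZE, and all function names are assumptions for illustration, not any particular database's format:

```python
import os
import struct
import tempfile
import zlib

PAGE_SIZE = 4096              # hypothetical page size
HEADER = struct.Struct("<I")  # hypothetical header: CRC32 of the payload


def write_page(fd, page_no, payload):
    """Store payload with its checksum at the page's offset."""
    if len(payload) != PAGE_SIZE - HEADER.size:
        raise ValueError("payload must fill the page exactly")
    page = HEADER.pack(zlib.crc32(payload)) + payload
    os.pwrite(fd, page, page_no * PAGE_SIZE)


def read_page(fd, page_no):
    """Explicit read: the checksum is verified at load time, in one place."""
    page = os.pread(fd, PAGE_SIZE, page_no * PAGE_SIZE)
    if len(page) != PAGE_SIZE:
        raise OSError("short read")
    (stored,) = HEADER.unpack_from(page)
    payload = page[HEADER.size:]
    if zlib.crc32(payload) != stored:
        raise ValueError(f"checksum mismatch on page {page_no}")
    return payload


def demo_checksum():
    """Write a page, read it back, then corrupt one byte and read again."""
    fd, path = tempfile.mkstemp()
    try:
        payload = b"x" * (PAGE_SIZE - HEADER.size)
        write_page(fd, 0, payload)
        ok = read_page(fd, 0) == payload
        os.pwrite(fd, b"y", HEADER.size)  # flip the first payload byte
        try:
            read_page(fd, 0)
            detected = False
        except ValueError:
            detected = True
        return ok, detected
    finally:
        os.close(fd)
        os.unlink(path)
```

With mmap there is no equivalent of read_page to hook into: the page simply appears in memory when it is touched.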
Performance Issues
- TODO : Add details?
File backed vs non-file backed (for PostgreSQL)
- When using MAP_ANONYMOUS there’s no file-based backing. Databases such as PostgreSQL use this for static shared memory allocation at startup.
- But using mmap is discouraged for the dynamic_shared_memory feature in PostgreSQL because the mapping then needs to be backed by a file. Because it’s backed by a file and managed by the OS (Linux):
  - it may write modified pages back to disk repeatedly, increasing system I/O load
  - or it might also cause inconsistency (if the OS flushes dirty pages!)
- Instead it recommends using shm_open (not a syscall)
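As a rough illustration of named POSIX shared memory, Python's multiprocessing.shared_memory module can be used; on POSIX systems CPython builds it on shm_open plus mmap. This sketch is unrelated to PostgreSQL's own code, and shm_roundtrip is a hypothetical name. It attaches to the segment a second time by name, the way a separate process would:

```python
from multiprocessing import shared_memory


def shm_roundtrip():
    """Create a named segment, attach to it by name, and read the data back."""
    seg = shared_memory.SharedMemory(create=True, size=64)
    try:
        seg.buf[:5] = b"hello"
        # A second handle, found by name only (as another process would do).
        peer = shared_memory.SharedMemory(name=seg.name)
        try:
            return bytes(peer.buf[:5])
        finally:
            peer.close()
    finally:
        seg.close()
        seg.unlink()  # remove the name so the segment can be freed
```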
Random notes
Interesting flags
- MAP_SHARED (Share with other processes)
- MAP_PRIVATE (Uses Copy on Write)
- MAP_ANONYMOUS (No file-based backing, directly in RAM)
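A small sketch of MAP_SHARED | MAP_ANONYMOUS used for IPC, assuming Linux (it relies on os.fork). The function name is made up for illustration. The child writes into the mapping and the parent sees the bytes, because both processes share the same pages:

```python
import mmap
import os


def child_writes_parent_reads():
    """Share one anonymous page between a parent and a forked child."""
    m = mmap.mmap(-1, mmap.PAGESIZE,
                  flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS,
                  prot=mmap.PROT_READ | mmap.PROT_WRITE)
    pid = os.fork()
    if pid == 0:
        # Child: write into the shared page, then exit without cleanup.
        try:
            m[:4] = b"pong"
        finally:
            os._exit(0)
    os.waitpid(pid, 0)  # parent: wait until the child has written
    data = m[:4]
    m.close()
    return data
```

With MAP_PRIVATE instead of MAP_SHARED the child's write would go to its own copy-on-write page and the parent would still see zero bytes.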
Precautionary Tales
- Use memory maps if you access data randomly or with sparse reads. Read files normally if you access data sequentially.
More resources
- c++ - mmap() vs. reading blocks - Stack Overflow 🌟
- Why Linux Has This Syscall?! - YouTube 🌟
- But how, exactly, databases use mmap?
- Why mmap is faster than system calls | by Alexandra (Sasha) Fedorova | Medium
- https://twitter.com/penberg/status/1352875939961700353 (has archive)
- Using mmap to make LLaMA load faster | Hacker News