tags : Systems

To read: https://x.com/7etsuo/status/1840268043982909838 🌟

FAQ

How are fds shared with a child process?

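  • Each running process has its own fd table. A child process does not share the parent’s table by default; it gets a copy of it.
  • After fork() or clone() (without CLONE_FILES set), the child and the parent have the same set of fds, and each matching pair refers to the same open file description, so the file offset and file status flags are shared. With CLONE_FILES set, the two share the fd table itself.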
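
A minimal sketch of the fork() case, assuming a POSIX system where os.fork() is available. The helper name parent_offset_after_child_read and the temp file contents are made up for illustration. The child reads 5 bytes through its copy of the fd, and the parent then sees the offset moved, because both fds refer to the same open file description.

```python
import os
import tempfile


def parent_offset_after_child_read():
    """Return the parent's file offset after the child has read 5 bytes."""
    # Hypothetical scratch file, only used for this demonstration.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
        path = tmp.name

    fd = os.open(path, os.O_RDONLY)
    try:
        pid = os.fork()
        if pid == 0:
            # Child: its fd table is a copy, but the entry refers to the
            # same open file description as the parent's fd.
            try:
                os.read(fd, 5)
            finally:
                os._exit(0)  # leave immediately, skip the parent's cleanup
        os.waitpid(pid, 0)
        # Parent never read anything, yet the shared offset has advanced.
        return os.lseek(fd, 0, os.SEEK_CUR)
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(parent_offset_after_child_read())
```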

What is file offset?

  • file offset == location for next read() / write()
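
A small sketch of how the offset moves, using os.lseek(fd, 0, os.SEEK_CUR) to query it without changing it. The function name offset_walk and the file contents are assumptions for illustration only.

```python
import os
import tempfile


def offset_walk():
    """Read a scratch file in two steps and report the offset along the way."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
        path = tmp.name

    fd = os.open(path, os.O_RDONLY)
    try:
        start = os.lseek(fd, 0, os.SEEK_CUR)       # offset right after open
        first = os.read(fd, 5)                     # read() advances the offset
        after_read = os.lseek(fd, 0, os.SEEK_CUR)
        os.lseek(fd, 6, os.SEEK_SET)               # jump to absolute position 6
        rest = os.read(fd, 5)
        end = os.lseek(fd, 0, os.SEEK_CUR)
        return start, first, after_read, rest, end
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(offset_walk())
```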

Intro

An FD is not directly related to an inode; inodes come into play only internally, where a particular file-system driver may use them.

  • FD is an abstract indicator to access a file or other input/output resources.
    • It’s an opaque non-negative integer (0 is a valid fd).
    • Even though it’s called a “file” descriptor, it can refer to something that is not a file (though “everything is a file” in Unix makes this fuzzy).
      • Up to V7 UNIX (1973-1979[2,3]), a descriptor could only refer to a file in the filesystem (including device files) or a pipe; UNIX domain, TCP, and UDP sockets weren’t introduced until 4.2BSD.
  • It decouples a file path (more correctly, an inode) from a file object inside a process and the Linux kernel.
  • Allows for opening the same file
    • An arbitrary number of times
    • For different purposes
    • With various flags
    • At different offsets.
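  • Each running process has its own table of file descriptors. The fd numbers are private to the process, although after fork() or dup() several fds can refer to the same underlying open file.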
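
To make the "same file, different flags, different offsets" point concrete, here is a rough sketch that opens one scratch file twice. The name two_independent_opens is hypothetical. Each open() call gets its own offset and its own flags, so reading through one fd does not move the other.

```python
import os
import tempfile


def two_independent_opens():
    """Open the same file twice and show that the offsets are independent."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"abcdef")
        path = tmp.name

    reader = os.open(path, os.O_RDONLY)                  # read-only
    appender = os.open(path, os.O_WRONLY | os.O_APPEND)  # write-only, append
    try:
        os.read(reader, 3)
        reader_pos = os.lseek(reader, 0, os.SEEK_CUR)
        appender_before = os.lseek(appender, 0, os.SEEK_CUR)
        os.write(appender, b"XY")  # O_APPEND: lands at the end of the file
        appender_after = os.lseek(appender, 0, os.SEEK_CUR)
        reader_pos_after = os.lseek(reader, 0, os.SEEK_CUR)
        return reader_pos, appender_before, appender_after, reader_pos_after
    finally:
        os.close(reader)
        os.close(appender)
        os.unlink(path)


if __name__ == "__main__":
    print(two_independent_opens())
```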

std[in,out,err]

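  • /dev/stdin, /dev/stdout, /dev/stderr are symlinks to /proc/self/fd/0, /proc/self/fd/1, /proc/self/fd/2, so they resolve to the calling process’s own fds 0, 1, 2.
  • /proc/self/fdinfo/<fd> contains per file descriptor info (offset, flags, mount id, inode number).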
λ ll /dev/ | rg fd
lrwxrwxrwx      13 root      13 Mar 18:45  fd -> /proc/self/fd
lrwxrwxrwx      15 root      13 Mar 18:45  stderr -> /proc/self/fd/2
lrwxrwxrwx      15 root      13 Mar 18:45  stdin -> /proc/self/fd/0
lrwxrwxrwx      15 root      13 Mar 18:45  stdout -> /proc/self/fd/1
λ ll /proc/self/fd/
lrwx------    64 geekodour 15 Mar 15:44  0 -> /dev/pts/1
lrwx------    64 geekodour 15 Mar 15:44  1 -> /dev/pts/1
lrwx------    64 geekodour 15 Mar 15:44  2 -> /dev/pts/1
λ lsof -d 0 +fg # same fd points to different files
COMMAND      PID      USER   FD   TYPE FILE-FLAG DEVICE SIZE/OFF    NODE NAME
systemd      663 geekodour    0r   CHR        LG    1,3      0t0       4 /dev/null
emacs        676 geekodour    0r   CHR        LG    1,3      0t0       4 /dev/null
alacritty    933 geekodour    0u   CHR  RW,AP,LG    4,1      0t0      20 /dev/tty1
fish         947 geekodour    0u   CHR     RW,ND  136,0      0t0       3 /dev/pts/0
λ cat /proc/self/fdinfo/{0,1,2}
pos:    0
flags:  02002
mnt_id: 29
ino:    7
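
The same fdinfo data can be read from inside a program. This is a Linux-only sketch that assumes /proc is mounted; read_fdinfo and fdinfo_after_read are made-up helper names. The pos field is decimal and the flags field is octal, matching the transcript above.

```python
import os
import tempfile


def read_fdinfo(fd):
    """Parse /proc/self/fdinfo/<fd> into a dict of strings (Linux only)."""
    info = {}
    with open(f"/proc/self/fdinfo/{fd}") as f:
        for line in f:
            key, _, value = line.partition(":")
            info[key.strip()] = value.strip()
    return info


def fdinfo_after_read():
    """Read 4 bytes from a scratch file, then return (pos, flags) from fdinfo."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
        path = tmp.name

    fd = os.open(path, os.O_RDONLY)
    try:
        os.read(fd, 4)
        info = read_fdinfo(fd)
        return int(info["pos"]), int(info["flags"], 8)  # flags are octal
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    if os.path.isdir("/proc/self/fdinfo"):
        print(fdinfo_after_read())
```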

TODO The system fd and per process fd table and inode

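Related syscalls: dup, dup2, dup3, fcntl. dup2 and dup3 let us choose the exact new fd number; fcntl with F_DUPFD returns the lowest free fd number greater than or equal to the one we pass.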
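
A hedged sketch of these calls through Python’s os and fcntl wrappers (dup3 has no direct wrapper in the standard library, so it is left out). The function name dup_variants and the minimum fd number 50 are arbitrary choices for illustration. All the duplicates end up at the same offset after a single read.

```python
import fcntl
import os
import tempfile


def dup_variants():
    """Duplicate one fd three ways and show they all share one offset."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"abcdef")
        path = tmp.name

    fd = os.open(path, os.O_RDONLY)
    opened = [fd]
    try:
        a = os.dup(fd)                          # lowest free fd number
        opened.append(a)
        b = fcntl.fcntl(fd, fcntl.F_DUPFD, 50)  # lowest free number >= 50
        opened.append(b)

        # Pick a number that is known to be free: reserve it, then release it.
        target = os.dup(fd)
        os.close(target)
        os.dup2(fd, target)                     # exactly this fd number
        opened.append(target)

        os.read(a, 2)  # read through one duplicate only
        offsets = [os.lseek(x, 0, os.SEEK_CUR) for x in opened]
        return {"offsets": offsets, "fcntl_fd": b,
                "dup2_fd": opened[-1], "target": target}
    finally:
        for x in opened:
            os.close(x)
        os.unlink(path)


if __name__ == "__main__":
    print(dup_variants())
```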

The tables

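| Name             | Level       | Other Names                                                         |
|------------------+-------------+---------------------------------------------------------------------|
| descriptor table | per process | Per process table                                                   |
| file table       | system wide | Open file description (OFD) table, Global FD table, System FD table |
| v-node table     | system wide | inode table                                                         |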

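Open file description table (OFD table)

It’s an abstraction: the Linux kernel keeps no single global table. Each entry is a separate kernel object (struct file).

  • Each entry stores the file status flags, the file offset (position), and a reference to the inode.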

Per process FD table

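This is a tangible thing: an actual per-process kernel structure that maps each fd number to an OFD. The possible arrangements are:

  • Multiple FDs in the same process can refer to the same OFD. (man 2 dup)
  • Multiple processes, each with their own FDs, can refer to the same OFD. (man 2 fork)
    • If parent and child both write to the fd, they advance the same shared file offset, so one write does not overwrite the other, but the order of their output is not guaranteed.
  • Multiple processes, each with their own FDs, can refer to distinct OFDs that point to the same inode. (man 2 open by both processes)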
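
A single-process sketch that contrasts the first and third arrangements, under the assumption that os.fstat().st_ino is a good enough stand-in for "same inode". The helper name compare_dup_and_reopen is made up. A dup()ed fd follows the original’s offset, while a second open() starts from 0 even though it points at the same inode.

```python
import os
import tempfile


def compare_dup_and_reopen():
    """Contrast a dup()ed fd (same OFD) with a second open() (new OFD)."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"abcdef")
        path = tmp.name

    first = os.open(path, os.O_RDONLY)
    alias = os.dup(first)               # same OFD as `first`
    other = os.open(path, os.O_RDONLY)  # distinct OFD, same inode
    try:
        os.read(first, 4)
        return {
            "alias_offset": os.lseek(alias, 0, os.SEEK_CUR),
            "other_offset": os.lseek(other, 0, os.SEEK_CUR),
            "same_inode": os.fstat(first).st_ino == os.fstat(other).st_ino,
        }
    finally:
        for fd in (first, alias, other):
            os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(compare_dup_and_reopen())
```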

TODO Shared and Private

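| Properties/Attributes      | Global Table | Per Process Table |
|----------------------------+--------------+-------------------|
| Operation Flag (O_CLOEXEC) |              | Private           |
| Ref. to Global Table       |              | Private           |
| File offset                | Shared       | Private (mapped)  |
| Access Mode (rw)           | Shared       | Private (mapped)  |
| Ref. to inode              | Shared       |                   |
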
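  • Some properties are stored in the OFD and some in the per-process table.
  • If we change a property of an FD from one process and that property is shared (stored in the OFD), the change is visible through every other FD that refers to the same OFD, including FDs in other processes.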
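
A rough sketch of the private/shared split, using two fds in one process that refer to the same OFD. The function name private_vs_shared_flags is hypothetical. At the fcntl level the close-on-exec flag is called FD_CLOEXEC and is read with F_GETFD (per fd), while status flags such as O_APPEND are read with F_GETFL (per OFD).

```python
import fcntl
import os
import tempfile


def private_vs_shared_flags():
    """Show that FD_CLOEXEC is per fd while O_APPEND lives in the shared OFD."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name

    fd = os.open(path, os.O_WRONLY)
    alias = os.dup(fd)  # second fd, same OFD
    try:
        # Per-fd flag: set it on `alias`, clear it on `fd`.
        fcntl.fcntl(alias, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
        fcntl.fcntl(fd, fcntl.F_SETFD, 0)

        # OFD status flag: turn on O_APPEND through `fd` only.
        flags = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_APPEND)

        return (
            bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC),
            bool(fcntl.fcntl(alias, fcntl.F_GETFD) & fcntl.FD_CLOEXEC),
            bool(fcntl.fcntl(alias, fcntl.F_GETFL) & os.O_APPEND),
        )
    finally:
        os.close(fd)
        os.close(alias)
        os.unlink(path)


if __name__ == "__main__":
    print(private_vs_shared_flags())
```

The first two values differ because each fd keeps its own close-on-exec flag; the third is true because the O_APPEND change made through one fd shows up through the other.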

FD Internals

Usage of FD

Creating FD

  • open, openat, creat, etc. create a new OFD and a new entry in the calling process’s fd table that refers to it, i.e. an entry in both tables.
  • When we create a new fd, the kernel guarantees to return the lowest non-negative number not currently open in the calling process, i.e. if we close the fd of a file, the next fd we create will reuse that number (provided no lower number is free).
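
The lowest-number rule can be checked with a few open/close calls. This is a minimal single-threaded sketch; lowest_fd_reuse is a made-up name and os.devnull is just a convenient file to open.

```python
import os


def lowest_fd_reuse():
    """Open two fds, close the first, and open again to see the number reused."""
    a = os.open(os.devnull, os.O_RDONLY)
    b = os.open(os.devnull, os.O_RDONLY)
    os.close(a)                           # frees the lower number
    c = os.open(os.devnull, os.O_RDONLY)  # gets the lowest free number
    os.close(b)
    os.close(c)
    return a, b, c


if __name__ == "__main__":
    print(lowest_fd_reuse())
```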