File Systems and Storage Management

COMP 3000
Operating Systems
Lianying Zhao
Memory vs. Storage
RAM has (relatively) low capacity, despite higher speed
In-RAM content disappears when powered off
Why can’t a program run directly on your hard drive?
Storage device abstractions are not part of a computer: it’s I/O!
Why is a Driver Needed?
Storage devices as an example (simplified):
The Block Device “Layer” (specific)
The actual media
Has its own drivers, e.g., USB Mass Storage
May even involve multiple layers
Block size can differ from the file system block size
Technically, file systems can “reside” on any block devices
Performance also matters
The write() system call usually doesn’t immediately cause a write to the block device
The sync command
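The difference between "written" and "on the device" can be sketched in C. This is a minimal illustration, not a definitive implementation: the function name write_durably and the file path in the usage note are made up for the example, and fsync() is used as the per-file counterpart of the sync command.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `msg` to `path` and force it to storage.
 * Returns 0 on success, -1 on error. */
int write_durably(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(msg);
    /* write() normally returns once the data sits in the kernel's cache. */
    if (write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* fsync() blocks until the cached data of this file has been
     * handed over to the storage device. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling write_durably("/tmp/demo.txt", "hello\n") from main() behaves like a plain write followed by a sync limited to that one file.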
The File System “Layer”
Abstraction: from “blocks” to “files”
For the time being, no ideal FS abstraction that enables full portability
Tight coupling with the OS kernel, cf. the file-based access control
From now on, we will examine file system concepts in the context of UNIX-like OSes (VFS)

Types of “Files”
Regular file
Symbolic link
FIFO (named pipe)
Device file (block, character)
What is a File Descriptor (fd)?
A value (non-negative integer) that refers to a data structure in the kernel (an entry in the per-process file descriptor table)
Like indices to arrays of structs
We are not talking about resource handles in computing in general
HANDLE, in Windows
So stdin, stdout and stderr are just special ones among them
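A small C sketch can make the "index into a table" idea concrete. It relies on the POSIX rule that open() returns the lowest-numbered unused descriptor, so a slot freed by close() is handed out again. The function name fd_reuse_demo is hypothetical, and the sketch assumes a single-threaded process.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path`, close it, open it again.
 * Returns 1 if both opens produced the same descriptor number,
 * 0 if not, -1 on error. Assumes no other thread opens files meanwhile. */
int fd_reuse_demo(const char *path)
{
    int first = open(path, O_RDONLY);
    if (first < 0)
        return -1;
    close(first);                 /* the table slot becomes free again */

    int second = open(path, O_RDONLY);
    if (second < 0)
        return -1;
    close(second);

    /* open() always picks the lowest free slot, so the number repeats. */
    return first == second;
}
```

The constants STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO from unistd.h are simply the descriptor numbers 0, 1 and 2.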
Tracing down the File Access
The file descriptor table
per process
The open file table
The i-node table
In-memory copy
Credit: Michael Kerrisk
So What really is an inode?
A POSIX (VFS) concept
In some sense, the inode is the file
Identified by an inode number (unique within a file system)
inode types:
regular file
char device
block device
(named) pipe
symbolic link
What is Stored in an inode?
inodes are data structures, so they are real, even for special files
They take space
They are in the file system storage (although there’s an in-memory copy)
The stat Command
Display detailed information about files/directories
More than ls does
Mainly corresponding to the inode
System calls stat(), fstat(), lstat()
How do you find out if a file/directory exists?
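One possible answer to the existence question, sketched in C with stat() and lstat(). The helper names path_exists and print_inode_info are made up for this example. Note that stat() follows symbolic links, so a dangling link is reported as missing here.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: 1 if `path` exists, 0 if it does not,
 * -1 on any other error (e.g., permission denied on a parent). */
int path_exists(const char *path)
{
    struct stat sb;
    if (stat(path, &sb) == 0)
        return 1;
    return (errno == ENOENT || errno == ENOTDIR) ? 0 : -1;
}

/* Hypothetical helper: print a few inode fields, similar in spirit
 * to what the stat command displays. Returns 0 or -1. */
int print_inode_info(const char *path)
{
    struct stat sb;
    if (lstat(path, &sb) != 0)      /* lstat() does not follow a final symlink */
        return -1;
    printf("inode=%llu links=%lu size=%lld mode=%o\n",
           (unsigned long long)sb.st_ino, (unsigned long)sb.st_nlink,
           (long long)sb.st_size, (unsigned)sb.st_mode);
    return 0;
}
```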
Directory entry (dentry)
Represents a directory entry (not necessarily a directory)
System calls – getdents(), not read()
Library calls – readdir()
A file is mapped to its inode by its parent directory
On ext2/ext3/ext4, the root (/) directory’s inode number is always 2 (other file systems may use a different number)
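The name-to-inode mapping held by a directory can be inspected with readdir(). The sketch below is illustrative only; list_dir is a hypothetical name.

```c
#include <dirent.h>
#include <stdio.h>

/* Hypothetical helper: print "inode-number name" for each entry of
 * `dirpath`. Returns the number of entries seen, or -1 on error. */
int list_dir(const char *dirpath)
{
    DIR *dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Each entry pairs a name with the inode number it maps to. */
        printf("%llu %s\n", (unsigned long long)entry->d_ino, entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling list_dir("/") prints one line per entry of the root directory.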
Hard Links and Symbolic Links
Symbolic link
Only linking to the target file name (more accurately: pathname)
What if the target file is deleted?
Hard link
Linking to the inode number
Everything identical, except for the different names
Not to a directory (why?)
Link count
Comparing with MS Windows again…
Reparse points
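The behaviour of both link types, including what happens when the target is deleted, can be checked with a short C experiment. This is a sketch under assumptions: link_demo and the gr_* file names are hypothetical, and `dir` must be a writable directory on a file system that supports both kinds of links.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if all observations match the slide,
 * 0 if any does not, -1 on error. */
int link_demo(const char *dir)
{
    char target[512], hard[512], soft[512];
    snprintf(target, sizeof target, "%s/gr_target.txt", dir);
    snprintf(hard, sizeof hard, "%s/gr_hard.txt", dir);
    snprintf(soft, sizeof soft, "%s/gr_soft.txt", dir);
    unlink(hard); unlink(soft); unlink(target);   /* allow re-runs */

    int fd = open(target, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    /* The hard link names the same inode; the symlink stores a pathname
     * (relative names are resolved from the symlink's own directory). */
    if (link(target, hard) != 0 || symlink("gr_target.txt", soft) != 0)
        return -1;

    struct stat st_target, st_hard, st_soft;
    if (stat(target, &st_target) != 0 || stat(hard, &st_hard) != 0 ||
        lstat(soft, &st_soft) != 0)
        return -1;

    int same_inode = st_target.st_ino == st_hard.st_ino;
    int two_links = st_target.st_nlink == 2;
    int soft_own_inode = S_ISLNK(st_soft.st_mode) &&
                         st_soft.st_ino != st_target.st_ino;

    /* Delete the original name. */
    if (unlink(target) != 0)
        return -1;
    int hard_survives = stat(hard, &st_hard) == 0 && st_hard.st_nlink == 1;
    int soft_dangles = stat(soft, &st_soft) != 0;   /* stat() follows the link */

    unlink(hard); unlink(soft);
    return same_inode && two_links && soft_own_inode &&
           hard_survives && soft_dangles;
}
```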
. and ..
File operations?

Copy creates a new inode
For move, it depends
Across different file systems, new inodes are created
Within the same file system, just relinked to the new pathname
Delete removes that directory entry and decreases the link count
The inode is removed as well when the link count drops to 0 (i.e., it was 1)
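The "move within the same file system" case can be verified with rename(), which only relinks the inode under a new name. The sketch below is illustrative: rename_keeps_inode and the gr_* file names are hypothetical. Across file systems rename() fails with EXDEV, which is why a move then falls back to copy and delete.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the inode number survives rename()
 * and the old name is gone, 0 if not, -1 on error. */
int rename_keeps_inode(const char *dir)
{
    char old_path[512], new_path[512];
    snprintf(old_path, sizeof old_path, "%s/gr_move_old.txt", dir);
    snprintf(new_path, sizeof new_path, "%s/gr_move_new.txt", dir);
    unlink(old_path); unlink(new_path);            /* allow re-runs */

    int fd = open(old_path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    struct stat before, after;
    if (stat(old_path, &before) != 0)
        return -1;
    if (rename(old_path, new_path) != 0)           /* EXDEV across file systems */
        return -1;
    if (stat(new_path, &after) != 0)
        return -1;

    int old_name_gone = access(old_path, F_OK) != 0;
    int same_inode = before.st_ino == after.st_ino;

    unlink(new_path);   /* link count 1 -> 0, so the inode is freed */
    return same_inode && old_name_gone;
}
```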
How do We Access Devices?
Special files!
Mostly /dev/*
Kernel-mode code behind each special file
Special files are files on a special file system
E.g., \\.\PHYSICALDRIVE0 (Windows)
E.g., /dev/sda (Linux)
Evolution of node generation in /dev
Manually generated hardcoded nodes
Device Files/Nodes (a.k.a. Special Files)
They represent physical or virtual hardware devices
A file system interface between device drivers and user-space applications
Identified by a major number and a minor number
Character devices
Accessed at the granularity of characters (bytes)
Not addressable (hence a stream)
Block devices
Accessed at the granularity of blocks
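The device type and the major/minor numbers are stored in the inode and can be read with stat(). This sketch assumes Linux with glibc, where the major() and minor() macros come from sys/sysmacros.h. The name device_kind is hypothetical.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major(), minor(): Linux/glibc assumption */

/* Hypothetical helper: returns 'c' for a character device, 'b' for a
 * block device, '-' for anything else, '?' on error. */
char device_kind(const char *path)
{
    struct stat sb;
    if (stat(path, &sb) != 0)
        return '?';

    if (S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode)) {
        /* st_rdev identifies the device this special file stands for. */
        printf("%s: major=%u minor=%u\n", path,
               major(sb.st_rdev), minor(sb.st_rdev));
        return S_ISCHR(sb.st_mode) ? 'c' : 'b';
    }
    return '-';
}
```

Calling device_kind("/dev/null") reports a character device. A disk such as /dev/sda, where present, is a block device.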
Size = 0?
The superblock: metadata about the whole file system
Primary and backup superblocks
The dumpe2fs command
View superblock information
Must be a block device (where a file system resides)
Blocks on a File System
inode: contains all metadata except the (file) name
Directory: contains the mapping between file names and inodes (dentries)
A directory is also a special inode (with an inode number)
Data blocks
Physical and Logical Sizes
Logical size:
The actual size of the file
MS Windows: “Size”
Physical size:
The amount of allocated space on disk
MS Windows: “Size on disk”
“Holes” in a file
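A hole can be created by seeking past the end of a file before writing. The sketch below compares the two sizes. make_sparse is a hypothetical name, and the 512-byte unit of st_blocks is the Linux convention, so treat it as an assumption elsewhere. Whether the hole really saves space depends on the file system.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create a file with a 1 MiB hole followed by one
 * byte, and report both sizes in bytes. Returns 0 on success, -1 on error. */
int make_sparse(const char *path, long long *logical, long long *physical)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Seeking past the end without writing leaves a hole behind. */
    if (lseek(fd, 1024 * 1024, SEEK_SET) < 0 || write(fd, "x", 1) != 1) {
        close(fd);
        return -1;
    }

    struct stat sb;
    int rc = fstat(fd, &sb);
    close(fd);
    if (rc != 0)
        return -1;

    *logical = (long long)sb.st_size;            /* "Size" */
    *physical = (long long)sb.st_blocks * 512;   /* "Size on disk" (Linux units) */
    return 0;
}
```

On a file system that supports holes, the physical size reported this way is typically far smaller than the logical size of 1048577 bytes.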
Using the dd Command
Experimenting with real devices can be risky
So let’s use a “virtual” version
dd is a command-line utility to copy/convert data
Always involves an input file (if) and an output file (of)
Not necessarily regular files
Block devices (e.g., /dev/sda)
Other special files (e.g., /dev/null, /dev/random, /dev/zero)
dd vs. cp
Well, they both copy files, so…
They serve different purposes
cp: works at the granularity of files
Can handle multiple files/directories
dd: works at the level of file I/O, with more control over how data is handled
Position control: seek (of), skip (if)
Conversion: e.g., encoding
Analogy: dd is like a file-based pipeline
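The position control that dd offers can be approximated in C with pread() and pwrite(). This is only a sketch of the idea, not how dd is implemented: copy_blocks is a hypothetical name, and unlike dd's default it never truncates the output file (closer to dd with conv=notrunc).

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: copy up to `count` blocks of `bs` bytes from `in`
 * to `out`, skipping `skip` blocks of input and seeking past `seek`
 * blocks of output. Returns the bytes copied, or -1 on error. */
long copy_blocks(const char *in, const char *out, size_t bs,
                 long skip, long seek, long count)
{
    int ifd = open(in, O_RDONLY);
    if (ifd < 0)
        return -1;
    int ofd = open(out, O_WRONLY | O_CREAT, 0644);
    if (ofd < 0) {
        close(ifd);
        return -1;
    }
    char *buf = malloc(bs);
    if (buf == NULL) {
        close(ifd);
        close(ofd);
        return -1;
    }

    long total = 0;
    for (long i = 0; i < count; i++) {
        /* Read block (skip + i) of the input ... */
        ssize_t n = pread(ifd, buf, bs, (off_t)((skip + i) * bs));
        if (n < 0) { total = -1; break; }
        if (n == 0) break;                        /* end of input */
        /* ... and place it at block (seek + i) of the output. */
        if (pwrite(ofd, buf, (size_t)n, (off_t)((seek + i) * bs)) != n) {
            total = -1;
            break;
        }
        total += n;
    }

    free(buf);
    close(ifd);
    close(ofd);
    return total;
}
```

For example, copy_blocks("in.bin", "out.bin", 4, 1, 0, 1) copies the second 4-byte block of in.bin to the start of out.bin (both file names are placeholders).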
File Systems Can be Corrupted
All types of persistent storage share the same risk
On-disk data: lifespan is long and damages also persist
In-RAM data: lifespan is short and can also be recreated
What can happen:
Failures during updates: a power failure or system crash leads to inconsistency
Media/data damage
A Few What-ifs
Interrupted right in the middle of updating on-disk structures
The crash consistency problem
Inodes are good, with missing/inconsistent data blocks
Good data blocks, with inodes missing or corrupted
You lose directory entries
The superblock is corrupted
The Lazy Approach: Let it Happen and Fix it
The fsck tool checks:
Link count **
Bad blocks *
The lost+found directory
Caution! File system integrity vs. data integrity
Journaling File Systems
Avoid the need to scan the whole file system after a crash
At the cost of some performance and storage overhead
All changes must first be written to a log in persistent storage before being applied to the actual data storage
Common file systems with journaling
Windows NTFS
ext3 and ext4
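The "log first, then apply" ordering can be imitated at user level with two ordinary files. This is a toy sketch and not how NTFS or ext3/ext4 implement their journals: real journals add commit records, checksums and replay logic. The name journaled_update and both paths are hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: replace the contents of `data_path` with `new_data`,
 * logging the change durably first. Returns 0 on success, -1 on error. */
int journaled_update(const char *log_path, const char *data_path,
                     const char *new_data)
{
    size_t len = strlen(new_data);

    /* 1. Record the intended change in the log and force it to storage. */
    int log_fd = open(log_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (log_fd < 0)
        return -1;
    if (write(log_fd, new_data, len) != (ssize_t)len || fsync(log_fd) != 0) {
        close(log_fd);
        return -1;
    }

    /* 2. Only now touch the real data. After a crash in this step,
     *    recovery could redo the change from the log. */
    int data_fd = open(data_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (data_fd < 0) {
        close(log_fd);
        return -1;
    }
    int ok = write(data_fd, new_data, len) == (ssize_t)len &&
             fsync(data_fd) == 0;
    close(data_fd);

    /* 3. The change is in place, so the log entry can be discarded. */
    if (ok)
        ok = ftruncate(log_fd, 0) == 0 && fsync(log_fd) == 0;
    close(log_fd);
    return ok ? 0 : -1;
}
```

The extra write and the extra fsync() in step 1 are the performance and storage overhead paid for not having to scan everything after a crash.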
Data Recovery?
Again, not to be confused with file system repair
Nor is it storage device (disk) repair…
Precondition: the data must still exist in some form…
Back up your data properly
Special File System: procfs (/proc)
Originally proposed in an academic paper in 1984
As its name implies: “each member of which, /proc/nnnnn, corresponds to the address space of the running process whose pid is nnnnn.”
Gradually extended to a wide range of information about the system, e.g.,
But /proc/sys belongs to sysctl, to configure the kernel at run-time
Special File System: sysfs (/sys)
A way to interact with kernel subsystems, hardware devices, and device drivers
Exposes kobject structures (internal to kernel code) as files to user space
sysfs_create_file() to create entries
User-space File Systems
Convenience: e.g., a wide choice of programming languages
Security & stability
FUSE = Filesystem in USErspace
Another layer of abstraction
Can convert virtually anything into a file system
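/* Callback table handed to FUSE: each field names the handler for one operation. */
static struct fuse_operations operations = {
    .getattr = do_getattr,   /* file attributes (stat) */
    .readdir = do_readdir,   /* directory listing */
    .read    = do_read,      /* file contents */
};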

Network File Systems
SSHFS (a FUSE file system)
So far, many things through SSH
Network tunneling
Why: Showing remote files as local files
NFS (not a FUSE file system)
You need a dedicated server listening on dedicated ports
Reasons for choosing it… performance, reliability, etc.
More on Differences of File Systems
The permission bits
How come I can mount all/many kinds of file systems on Linux and still see the same view?
Possibility: the driver fakes it
fstype (-t) selects the file system driver
Mount options
Accessing Files/Directories Programmatically
File operations are based on file descriptors (FDs)
Manipulating FDs
Redirection of stdin/stdout/stderr
struct dirent (directory entry)
struct stat (file/inode, Tutorial 5)
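Redirection comes down to making descriptor 1 refer to a different open file. A minimal sketch with dup() and dup2() follows. print_to_file is a hypothetical name, and the sketch assumes stdout is open when it is called.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: temporarily redirect stdout to `path`, print
 * `msg` there, then restore stdout. Returns 0 on success, -1 on error. */
int print_to_file(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    fflush(stdout);                        /* empty the buffer before switching */
    int saved = dup(STDOUT_FILENO);        /* remember where stdout pointed */
    if (saved < 0 || dup2(fd, STDOUT_FILENO) < 0) {
        if (saved >= 0)
            close(saved);
        close(fd);
        return -1;
    }
    close(fd);                             /* descriptor 1 now refers to the file */

    printf("%s", msg);                     /* lands in the file */
    fflush(stdout);

    dup2(saved, STDOUT_FILENO);            /* put the original stdout back */
    close(saved);
    return 0;
}
```

This is essentially what a shell does for `command > file`, except that the shell performs the dup2() in the child process before exec.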
RE: Tutorial 3
What if the signal handler is triggered during a system call?
System call aborts and returns an error
Signal handler waits until system call is finished
System call is paused, signal handler runs, and then system call is resumed
COMP 3000 (Fall 2020)