Operating Systems Lecture 18: Files and directories
Understanding File Systems in Operating Systems
Introduction to Files and Directories
- The upcoming lectures will focus on the file system component of operating systems, starting with an exploration of files and directories.
- A file is an array of bytes stored persistently on disk. It is identified by a human-readable name and by an OS-level identifier called the inode number.
- The filename is chosen by the user, while the inode number is assigned by the operating system and is unique only within its file system.
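The two identifiers can be seen side by side with a short sketch. It uses Python's os.stat, a wrapper around the stat system call; the helper name inode_of and the scratch file name are assumptions made for illustration.

```python
import os
import tempfile

def inode_of(path):
    """Return the inode number the OS assigned to the file at `path`."""
    return os.stat(path).st_ino

# Demo: create a scratch file (hypothetical name) and look up its inode number.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "w") as f:
        f.write("hello")
    inode = inode_of(path)

print("name:", os.path.basename(path), "inode:", inode)
```

The name is whatever the user chose; the number printed depends on the file system and will differ from run to run.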
Understanding Directories
- A directory serves as a container for files and sub-directories, functioning similarly to a special type of file that holds names and identifiers of contained files.
- The structure of files and directories resembles a tree format, beginning from the root directory. Each file can be accessed via its path name.
File Operations: Creation and Access
- Files are created using the open system call with a flag that requests creation (O_CREAT); the call returns a file descriptor that acts as a handle for future operations.
- Even existing files must be opened before they can be read or written; all subsequent access goes through the returned file descriptor.
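A minimal sketch of creating and then reopening a file, assuming Python's os.open as a stand-in for the open system call. The helper create_file, the file name, and the permission bits 0o644 are illustrative choices, not part of the lecture.

```python
import os
import tempfile

def create_file(path):
    """Create `path` if it does not exist and return a write-only file descriptor."""
    flags = os.O_CREAT | os.O_WRONLY | os.O_TRUNC
    return os.open(path, flags, 0o644)  # 0o644: permission bits for a new file

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "new.txt")   # hypothetical scratch file
    fd = create_file(path)                # creation returns a descriptor
    created = os.path.exists(path)
    os.close(fd)

    fd2 = os.open(path, os.O_RDONLY)      # an existing file must still be opened
    os.close(fd2)

print("descriptor:", fd, "created:", created)
```

The descriptor is just a small non-negative integer; its exact value depends on which descriptors the process already has open.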
Reading from and Writing to Files
- The read operation retrieves data sequentially from an opened file using its descriptor. Subsequent reads continue from where the last read left off.
- System calls for reading/writing require arguments such as the file descriptor, buffer location for data storage, and size specifications for how much data to process.
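The sequential behaviour of read can be sketched as follows. Note one difference from the C interface described above: Python's os.read takes only the descriptor and a size and returns the bytes, instead of filling a caller-supplied buffer. The helper read_in_chunks and the file name are assumptions for illustration.

```python
import os
import tempfile

def read_in_chunks(fd, size):
    """Read from `fd` until end of file, `size` bytes at a time."""
    chunks = []
    while True:
        chunk = os.read(fd, size)   # continues where the previous read stopped
        if not chunk:               # empty result means end of file
            break
        chunks.append(chunk)
    return chunks

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.txt")  # hypothetical scratch file
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    os.write(fd, b"abcdefghij")           # descriptor, then the data to write
    os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    chunks = read_in_chunks(fd, 4)
    os.close(fd)

print(chunks)  # [b'abcd', b'efgh', b'ij']
```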
Advanced File Operations
- To jump to a different position within a file before reading or writing, use the lseek system call.
- The fsync system call forces writes that are still buffered in memory to be saved to disk.
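Both calls can be combined in a small sketch: seek to an offset, overwrite a few bytes, then force the change to disk. It assumes Python's os.lseek and os.fsync wrappers; the helper overwrite_at and the file contents are made up for the example.

```python
import os
import tempfile

def overwrite_at(fd, offset, data):
    """Write `data` at byte `offset` of the open file, then flush it to disk."""
    os.lseek(fd, offset, os.SEEK_SET)   # move the file offset (from the start)
    written = os.write(fd, data)
    os.fsync(fd)                        # do not return until the data is on disk
    return written

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "seek.txt")  # hypothetical scratch file
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    os.write(fd, b"0123456789")
    overwrite_at(fd, 5, b"XY")
    os.lseek(fd, 0, os.SEEK_SET)          # rewind before reading back
    contents = os.read(fd, 10)
    os.close(fd)

print(contents)  # b'01234XY789'
```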
Additional File System Operations
Understanding File System Operations
Overview of Executable Programs and System Calls
- The ls command is an executable program that uses system calls to read and display the files in a directory.
- A directory is treated like a file: it contains entries with file names and inode numbers, which are read using the readdir system call.
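A rough sketch of what a program like ls does, assuming Python's os.scandir as the directory-reading interface (it skips the "." and ".." entries). The helper list_entries and the file names are hypothetical.

```python
import os
import tempfile

def list_entries(directory):
    """Return (name, inode number) pairs for every entry in `directory`."""
    with os.scandir(directory) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)

with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt"):            # hypothetical scratch files
        with open(os.path.join(tmp, name), "w"):
            pass
    listing = list_entries(tmp)
    # The inode stored in each directory entry matches what stat reports.
    inodes_match = all(
        os.stat(os.path.join(tmp, name)).st_ino == ino for name, ino in listing
    )

for name, ino in listing:
    print(ino, name)
```

This is similar in spirit to running ls -i, which prints each entry's inode number next to its name.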
File Linking Concepts
- Linking lets multiple names (e.g., file and file2) point to the same underlying inode number, so both names share the same content.
- Removing one link (e.g., deleting file) does not delete the actual data if another link (file2) still exists; this illustrates hard linking.
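The file and file2 scenario can be reproduced with a short sketch, assuming Python's os.link and os.unlink wrappers around the link and unlink system calls. The st_nlink field of the stat result is the number of names currently pointing at the inode.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "file")
    second = os.path.join(tmp, "file2")
    with open(first, "w") as f:
        f.write("shared content")

    os.link(first, second)                      # second name for the same inode
    same_inode = os.stat(first).st_ino == os.stat(second).st_ino
    links_before = os.stat(second).st_nlink     # two names point at the inode

    os.unlink(first)                            # remove only the first name
    links_after = os.stat(second).st_nlink
    with open(second) as f:
        surviving = f.read()                    # data is still reachable

print(same_inode, links_before, links_after, surviving)
```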
Hard Links vs. Soft Links
- A hard link is a direct pointer from a filename to an inode. The inode keeps a reference count (the link count) of how many filenames point to it.
- Deleting a file unlinks one name and decrements this count. Only when the count reaches zero can the system free the inode and its data.
- In contrast, a soft link (or symbolic link), created with ln -s, is a separate file that stores the path name of its target rather than pointing to the target's inode. If the target is deleted, the link is left dangling.
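For comparison with hard links, here is a minimal sketch of a symbolic link, assuming Python's os.symlink, os.readlink, and os.lstat wrappers. The names file and alias are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "file")
    link = os.path.join(tmp, "alias")
    with open(target, "w") as f:
        f.write("data")

    os.symlink(target, link)              # same effect as: ln -s file alias
    stored_path = os.readlink(link)       # the link stores a path, not an inode
    # lstat inspects the link itself, which has an inode of its own.
    own_inode = os.lstat(link).st_ino != os.stat(target).st_ino

    os.unlink(target)                     # delete the target
    dangling = os.path.islink(link) and not os.path.exists(link)

print(stored_path == target, own_inode, dangling)
```

After the target is removed the link still exists as a directory entry, but following it fails because the stored path no longer resolves.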
Mounting File Systems
- Mounting connects a disk's file system at a specific location in the directory tree, making all its files accessible from that mount point.
- Multiple file systems can coexist on one machine, and each may be of a different type.
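One way to observe mounts from a program, sketched under the assumption of a POSIX system: os.path.ismount reports whether a path is a mount point, and the st_dev field of a stat result identifies the file system a path lives on. The helper same_file_system is hypothetical.

```python
import os
import tempfile

def same_file_system(path_a, path_b):
    """Report whether two paths live on the same mounted file system."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev

# The root directory is where the first file system is attached.
root_is_mount = os.path.ismount("/")

with tempfile.TemporaryDirectory() as tmp:
    subdir = os.path.join(tmp, "sub")     # hypothetical scratch directory
    os.mkdir(subdir)
    together = same_file_system(tmp, subdir)

print(root_is_mount, together)
```

Because inode numbers are unique only within one file system, a file is identified machine-wide by the pair of its device number and inode number.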
Memory Mapping Files
- Memory mapping via the mmap system call allocates pages in a process's virtual address space through which file data can be read or written directly.
Understanding File Backed Pages vs. Anonymous Pages
Overview of Memory Mapping
- Definition of File-Backed Pages: These are pages of memory whose contents come from a file on disk, in contrast to anonymous pages, which have no backing file.
- Functionality of mmap: When an open file is passed as an argument to the mmap function (as a file descriptor rather than a filename), it allocates pages in the virtual address space and fills them with the contents of that file.
- Reading and Writing: Once a file is memory-mapped, it can be accessed like any other variable or piece of memory within the program. This allows for direct read/write operations.
- Accessing Data: The data from the mapped file can be treated as a large byte array, enabling reading or writing at any location within that array.
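The byte-array view of a mapped file can be sketched with Python's mmap module, which wraps the mmap system call. The helper patch_via_mmap and the file contents are assumptions for illustration; a length of 0 asks for the whole file to be mapped, which requires the file to be non-empty.

```python
import mmap
import os
import tempfile

def patch_via_mmap(path, offset, data):
    """Overwrite bytes of the file at `path` through a memory mapping.

    Returns the bytes that were previously stored at that position.
    """
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mapped:       # 0 = map the whole file
            old = mapped[offset:offset + len(data)]    # read like a byte array
            mapped[offset:offset + len(data)] = data   # write like a byte array
            mapped.flush()                             # push changes to the file
    return old

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mapped.txt")             # hypothetical scratch file
    with open(path, "wb") as f:
        f.write(b"hello world")
    old = patch_via_mmap(path, 0, b"J")
    with open(path, "rb") as f:
        result = f.read()

print(old, result)  # b'h' b'Jello world'
```

No read or write call is issued for the patched bytes; the program only indexes into memory, and the operating system carries the change back to the file.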