Operating Systems 10EC65
Unit 6
File Systems
Reference Book: “Operating Systems: A Concept-Based Approach”, D. M. Dhamdhere, TMH, 3rd Edition, 2010
Shrishail Bhat, AITM Bhatkal
Introduction
• Computer users store programs and data in files so that they can be used conveniently and preserved across computing sessions
• The resources used for storing and accessing files are I/O devices
• Operating systems organize file management into two components:
  – File System
  – Input-Output Control System (IOCS)
• A file system provides facilities for creating and manipulating files, for ensuring reliability of files when faults such as power outages or I/O device malfunctions occur, and for specifying how files are to be shared among users
• The IOCS provides access to data stored on I/O devices and ensures good performance of the I/O devices
Overview of File Processing
• File processing refers to the general sequence of operations of opening a file, reading data from the file or writing data into it, and closing the file
Overview of File Processing (continued)
• Figure 13.1 shows the arrangement through which an OS implements file processing activities of processes
• Each directory contains entries describing some files
• The directory entry of a file indicates the name of its owner, its location on a disk, the way its data is organized, and which users may access it in what manner
• The code of a process P_i is shown in the left part of Figure 13.1
• When it opens a file for processing, the file system locates the file through the directory structure, which is an arrangement of many directories
• In Figure 13.1, there are two files named beta located in different directories
• When process P_i opens beta, the manner in which it names beta, the directory structure, and the identity of the user who initiated process P_i together determine which of the two files will be accessed
Overview of File Processing (continued)
• A file system provides several file types. Each file type provides its own abstract view of data in a file – we call it a logical view of data. • The IOCS organizes a file’s data on an I/O device in accordance with its file type. It is the physical view of the file’s data. • The mapping between the logical view of the file’s data and its physical view is performed by the IOCS
File System and the IOCS
• The file system views a file as a collection of data that is owned by a user, shared by a set of authorized users, and reliably stored over an extended period
• The IOCS views it as a repository of data that is accessed speedily and stored on an I/O device that is used efficiently
File System and the IOCS (continued)
• A file system has two kinds of data:
  – File data (or simply data): data contained within files
  – Control data (metadata): data used to access files
File Processing in a Program
• At the programming language level:
  – A file is an object with attributes describing the organization of its data and the method of accessing the data
  – A program contains a declaration statement for a file, which specifies values of its attributes, and statements that open it, perform read/write operations on it, and close it
• During execution of the program, file processing is actually implemented by library modules of the file system and the IOCS
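The open, read/write, close sequence can be illustrated with a minimal Python sketch. It uses Python's built-in `open`; the file name `alpha.txt` and the record contents are made up for the illustration.

```python
import os
import tempfile

def process_file(path: str) -> list[str]:
    """Write two records to a file, then reopen it and read them back."""
    f = open(path, "w", encoding="utf-8")   # open the file for writing
    f.write("record 1\n")                   # write operations
    f.write("record 2\n")
    f.close()                               # close the file

    f = open(path, "r", encoding="utf-8")   # open the same file for reading
    records = [line.rstrip("\n") for line in f]
    f.close()
    return records

# Use a temporary directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as d:
    demo_records = process_file(os.path.join(d, "alpha.txt"))
```

The library calls hide the work done underneath by the file system and the IOCS.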
Files and File Operations
• File types can be grouped into two classes:
  – Structured files: collection of records
    • Record: collection of fields
    • Field: contains a single data item
    • Each record is assumed to contain a unique key field
  – Byte stream files: “flat” files, with no records or fields imposed on the data
• A file has attributes, which are stored in its directory entry
Layers of the Input-Output Control System
• Input-output control system (IOCS) holds some file data in memory to provide efficient file processing and high device throughput
Layers of the Input-Output Control System (continued)
• Two layers: – Access method layer provides efficient file processing – Physical IOCS layer provides high device throughput
Overview of I/O Organization
• Three modes of performing I/O operations – programmed mode, interrupt mode, and direct memory access (DMA) mode
• In DMA mode, I/O devices are connected to device controllers, which are in turn connected to the DMA controller
• Each device controller has a unique numeric id
  – Each device connected to it has a unique numeric device id
• A device address is the pair (controller_id, device_id)
Overview of I/O Organization (continued)
• An I/O operation involves:
  – Operation to be performed: read, write, etc.
  – Address of the I/O device
  – Number of bytes of data to be transferred
  – Addresses of areas in memory and on the I/O device that are to participate in the data transfer
• When an I/O operation is performed in the DMA mode, the CPU initiates the I/O operation, but it is not involved in data transfer between an I/O device and memory
• To facilitate this mode of I/O, an I/O operation is initiated by executing an I/O instruction
• The I/O instruction points to a set of I/O commands that specify the individual tasks involved in the data transfer
I/O Operations
• The I/O operation to read the data recorded in a disk block with the id (track_id, block_id) is performed by executing: – I/O-init (controller_id, device_id), I/O_command_addr
• I/O_command_addr is the start address of the memory area containing the following two I/O commands:
  – Position disk heads on track track_id
  – Read block block_id into the memory area with the start address memory_addr
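As a rough illustration only, the I/O-init instruction and its two I/O commands can be simulated in Python. Everything here is hypothetical: the class names `SimulatedDisk`, `Seek` and `Read`, the function `io_init`, and the device address (1, 2) are invented for the sketch, and real hardware executes the commands without CPU involvement in the transfer.

```python
from dataclasses import dataclass

@dataclass
class SimulatedDisk:
    blocks: dict          # (track_id, block_id) -> bytes stored in that block
    head_track: int = 0   # track on which the heads are currently positioned

@dataclass
class Seek:               # "Position disk heads on track track_id"
    track_id: int

@dataclass
class Read:               # "Read a block into the memory area at memory_addr"
    block_id: int
    memory_addr: int

def io_init(devices, device_address, commands, memory):
    """Run a list of I/O commands on the device at (controller_id, device_id)."""
    disk = devices[device_address]
    for cmd in commands:
        if isinstance(cmd, Seek):
            disk.head_track = cmd.track_id
        elif isinstance(cmd, Read):
            data = disk.blocks[(disk.head_track, cmd.block_id)]
            end = cmd.memory_addr + len(data)
            memory[cmd.memory_addr:end] = data   # same-length slice: in-place copy

# One disk at device address (1, 2) holding a single block on track 5.
devices = {(1, 2): SimulatedDisk(blocks={(5, 3): b"DATA"})}
memory = bytearray(16)
io_init(devices, (1, 2), [Seek(track_id=5), Read(block_id=3, memory_addr=8)], memory)
```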
Fundamental File Organizations and Access Methods
• Fundamental record access patterns:
  – Sequential access: records are accessed in the order in which they fall in a file (or in the reverse of that order)
  – Random access: records may be accessed in any order
• File organization is a combination of two features:
  – Method of arranging records in a file
  – Procedure for accessing them
• Accesses to files governed by a specific file organization are implemented by an IOCS module called an access method
Sequential File Organization
• Records are stored in an ascending or descending sequence according to the key field
• Record access pattern of an application is expected to follow suit
• Two kinds of operations:
  – Read the next (or previous) record
  – Skip the next (or previous) record
• Uses:
  – When data can be conveniently presorted into an ascending or descending order
  – For byte stream files
Direct File Organization
• Provides convenience/efficiency of file processing when records are accessed in a random order
• Files are called direct-access files
• Read/write command indicates value in key field
  – Key value is used to generate address of record in storage medium
• Disadvantages:
  – Record address calculation consumes CPU time
  – Some recording capacity of disk is wasted
  – Dummy records exist for key values that are not in use
Example: Sequential and Direct Files
• Employees with the employee numbers 3, 5–9 and 11 have left the organization – Direct file has dummy records for them
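A small Python sketch of the direct file in this example. It assumes that employee numbers run from 1 to 11, that records have a fixed length of 16 bytes, and that the record address is (key − 1) × record size. The employee names are invented.

```python
RECORD_SIZE = 16                      # assumed fixed record length, in bytes
DUMMY = b"\x00" * RECORD_SIZE         # dummy record for a key value not in use

def record_address(key: int) -> int:
    """Generate the byte address of a record from its key value."""
    return (key - 1) * RECORD_SIZE

def build_direct_file(employees: dict, max_key: int) -> bytearray:
    """Reserve one slot per possible key; unused keys keep dummy records."""
    storage = bytearray(DUMMY * max_key)
    for key, name in employees.items():
        start = record_address(key)
        storage[start:start + RECORD_SIZE] = name.encode("ascii").ljust(RECORD_SIZE, b"\x00")
    return storage

def read_record(storage: bytearray, key: int):
    """Random access: go straight to the computed address."""
    start = record_address(key)
    rec = bytes(storage[start:start + RECORD_SIZE])
    return None if rec == DUMMY else rec.rstrip(b"\x00").decode("ascii")

# Employees 3, 5-9 and 11 have left, so only 1, 2, 4 and 10 remain.
employees = {1: "Asha", 2: "Ravi", 4: "Meena", 10: "John"}
direct_file = build_direct_file(employees, max_key=11)
dummy_count = sum(1 for k in range(1, 12) if read_record(direct_file, k) is None)
```

Under these assumptions, seven of the eleven slots hold dummy records, which is the wasted recording capacity noted among the disadvantages. A sequential file would store only the four live records.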
Index Sequential File Organization
• An index helps determine location of a record from its key value
• Pure indexed organization: the index of a file contains an index entry with the format (key value, disk address)
• Index sequential organization is a hybrid organization that combines elements of the indexed and the sequential file organizations
• It uses an index to identify the section of the disk surface that may contain the record
  – Records in the section are then searched sequentially
Index Sequential File Organization (continued)
• To locate the record with a key k, first the higher-level index is searched to locate the group of tracks that may contain the desired record
• The track index for the tracks of the group is now searched to locate the track that may contain the desired record, and the selected track is searched sequentially for the record with key k
• The search ends unsuccessfully if it fails to find the record on the track
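The two-level search can be sketched in Python as follows. The layout is an assumption made for the sketch: each index entry is stored as (highest key covered, target), and the keys and record contents are invented.

```python
from bisect import bisect_left

# Hypothetical file: records on each track are kept sorted by key.
tracks = {
    0: [(3, "rec3"), (8, "rec8"), (12, "rec12")],
    1: [(15, "rec15"), (21, "rec21"), (27, "rec27")],
    2: [(31, "rec31"), (40, "rec40"), (45, "rec45")],
    3: [(52, "rec52"), (58, "rec58"), (60, "rec60")],
}
# Track index of each group: (highest key on the track, track number).
track_index = {0: [(12, 0), (27, 1)], 1: [(45, 2), (60, 3)]}
# Higher-level index: (highest key in the group of tracks, group number).
higher_index = [(27, 0), (60, 1)]

def lookup(entries, k):
    """Return the target of the first entry whose highest key is >= k, or None."""
    pos = bisect_left([high for high, _ in entries], k)
    return entries[pos][1] if pos < len(entries) else None

def find_record(k):
    group = lookup(higher_index, k)          # step 1: find the group of tracks
    if group is None:
        return None
    track = lookup(track_index[group], k)    # step 2: find the track in the group
    if track is None:
        return None
    for key, data in tracks[track]:          # step 3: sequential search on the track
        if key == k:
            return data
        if key > k:                          # keys are sorted, so k is absent
            break
    return None
```

For example, `find_record(21)` goes to group 0, then track 1, and finds the record on the second comparison, while `find_record(22)` ends unsuccessfully on the same track.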
Access Methods
• Access method: IOCS module that implements accesses to a class of files using a specific file organization
  – Procedure determined by file organization
  – Advanced I/O techniques are used for efficiency:
    • Buffering of records
      – Records of an input file are read ahead of the time when they are needed by a process
    • Blocking of records
      – A large block of data, whose size exceeds the size of a record in the file, is always read from, or written onto, the I/O medium
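A minimal Python sketch of blocking of records. It assumes 8-byte records and a blocking factor of 4 (both values chosen only for illustration), and uses an in-memory `io.BytesIO` object to stand in for the I/O device. The class name `BlockedReader` is hypothetical.

```python
import io

RECORD_SIZE = 8        # assumed record length, in bytes
BLOCKING_FACTOR = 4    # assumed number of records per block

class BlockedReader:
    """Reads a whole block per I/O operation and hands out records one at a time."""

    def __init__(self, device):
        self.device = device
        self.block = b""
        self.pos = 0
        self.io_operations = 0

    def read_record(self):
        if self.pos >= len(self.block):          # current block is used up
            self.block = self.device.read(RECORD_SIZE * BLOCKING_FACTOR)
            self.io_operations += 1
            self.pos = 0
            if not self.block:                   # end of file
                return None
        record = self.block[self.pos:self.pos + RECORD_SIZE]
        self.pos += RECORD_SIZE
        return record

records = [f"rec{i:05d}".encode("ascii") for i in range(10)]
reader = BlockedReader(io.BytesIO(b"".join(records)))
read_back = [reader.read_record() for _ in range(10)]
```

Reading the ten records takes three I/O operations here instead of ten, which is the saving that blocking aims for.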
Directories
• A directory contains information about a group of files • Each entry in a directory contains the attributes of one file, such as its type, organization, size, location, and the manner in which it may be accessed by various users in the system
Directories (continued)
• File system needs to grant users: – File naming freedom – File sharing
• File system creates several directories – Uses a directory structure to organize them • Provides file naming freedom and file sharing
Directory Trees
• Some concepts: home directory, current directory
• Path names are used to uniquely identify files
  – Relative path name: starts from the current directory
  – Absolute path name: starts from the root of the directory tree
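The difference between the two kinds of path names can be sketched with Python's standard `posixpath` module. The directory names are invented, and `resolve` is a hypothetical helper, not part of any file system API.

```python
import posixpath

def resolve(current_dir: str, path_name: str) -> str:
    """Turn a relative or absolute path name into an absolute path name."""
    if posixpath.isabs(path_name):
        return posixpath.normpath(path_name)          # already starts at the root
    # A relative path name is interpreted with respect to the current directory.
    return posixpath.normpath(posixpath.join(current_dir, path_name))

from_relative = resolve("/home/A", "projects/beta")
from_absolute = resolve("/home/A", "/home/B/beta")
```

With /home/A as the current directory, the relative name projects/beta and the absolute name /home/B/beta identify two different files named beta.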
Directory Graphs
• Tree structure leads to a fundamental asymmetry in the way different users can access a shared file – Solution: use acyclic graph structure for directories • A link is a directed connection between two existing files in the directory structure
Operations on Directories
• Most frequent operation on directories: search
• Other operations are maintenance operations like:
  – Creating or deleting files
  – Updating file entries (upon a close operation)
  – Listing a directory
  – Deleting a directory
• Deletion becomes complicated when directory structure is a graph
  – A file may have multiple parents
  – File system maintains a link count with each file
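The role of the link count can be sketched in Python. The classes `FileNode` and `Directory` are hypothetical and model only what deletion in a graph structure needs.

```python
class FileNode:
    def __init__(self, name: str):
        self.name = name
        self.link_count = 0        # number of directory entries pointing to the file

class Directory:
    def __init__(self):
        self.entries = {}          # entry name -> FileNode

    def link(self, name: str, node: FileNode) -> None:
        self.entries[name] = node
        node.link_count += 1

    def delete(self, name: str) -> bool:
        """Remove the entry; return True only if the file itself can be deleted."""
        node = self.entries.pop(name)
        node.link_count -= 1
        return node.link_count == 0

beta = FileNode("beta")
dir_a, dir_b = Directory(), Directory()
dir_a.link("beta", beta)           # entry in the owner's directory
dir_b.link("beta_link", beta)      # link from a second parent directory
first_delete = dir_a.delete("beta")
second_delete = dir_b.delete("beta_link")
```

The first delete only removes a directory entry. The file's disk space can be released only after the second delete brings the link count to zero.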
Organization of Directories
• Flat file that is searched linearly: inefficient
• Hash table directory: efficient search
  – Hash with open addressing requires a single table
  – (Sometimes) at most two comparisons needed to locate a file
  – Cumbersome to change size, or to delete an entry
• B+ tree directory: fast search, efficient add/delete
  – m-way search tree where m ≤ 2×d (d: order of tree)
  – Balanced tree: fast search
  – File information stored in leaf nodes
  – Nonleaf nodes of the tree contain index entries
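A hash table directory with open addressing might look like the following Python sketch. The design is assumed for illustration: linear probing, a table of 8 slots, and a toy hash function that sums the bytes of the file name. A real file system would use a stronger hash.

```python
class HashDirectory:
    """Single-table directory using open addressing with linear probing."""

    def __init__(self, size: int = 8):
        self.slots = [None] * size            # each slot: (file_name, file_info) or None

    def _probe(self, name: str):
        start = sum(name.encode("utf-8")) % len(self.slots)   # toy hash function
        for i in range(len(self.slots)):
            yield (start + i) % len(self.slots)

    def add(self, name: str, info) -> None:
        for idx in self._probe(name):
            if self.slots[idx] is None or self.slots[idx][0] == name:
                self.slots[idx] = (name, info)
                return
        raise RuntimeError("directory table is full")

    def search(self, name: str):
        for idx in self._probe(name):
            if self.slots[idx] is None:       # empty slot: the name is not present
                return None
            if self.slots[idx][0] == name:
                return self.slots[idx][1]
        return None

directory = HashDirectory()
directory.add("alpha", 10)
directory.add("beta", 20)
directory.add("alhpa", 30)   # same byte sum as "alpha", so it lands in the next slot
```

The sketch has no delete operation. Simply emptying a slot would cut the probe sequence for later entries, which is one reason deletion is cumbersome in this organization.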
Directory as a B+ tree
Mounting of File Systems
• There can be many file systems in an OS
• Each file system is constituted on a logical disk
  – i.e., on a partition of a disk
• Files can be accessed only when file system is mounted • The mount operation is what “connects” the file system to the system’s directory structure.
File Protection
• Users need controlled sharing of files
  – The protection info field of the file’s directory entry is used to control access to the file
• Usually, protection information is stored in an access control list
  – List of (<user_name>, <list_of_access_privileges>) pairs
    • User groups can be used to reduce the size of the list
• In most file systems, privileges are of three kinds:
  – Read
  – Write
  – Execute
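An access control list check can be sketched in a few lines of Python. The user names and the helper `is_permitted` are hypothetical.

```python
# Access control list of one file: (user_name, set of access privileges) pairs.
acl_beta = [
    ("A", {"read", "write"}),
    ("B", {"read"}),
]

def is_permitted(acl, user: str, privilege: str) -> bool:
    """Return True if the list grants the privilege to the user."""
    for user_name, privileges in acl:
        if user_name == user:
            return privilege in privileges
    return False          # users not named in the list get no access
```

Here user B may read the file but not write it, and a user who does not appear in the list is denied every kind of access.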
Allocation of Disk Space
• Disk space allocation is performed by the file system
• Early file systems adapted the contiguous memory allocation model
  – Led to external fragmentation
• Modern file systems adapt the noncontiguous memory allocation model
  – Issues:
    • Managing free disk space
      – Use: free list or disk status map (DSM)
    • Avoiding excessive disk head movement
      – Use: extents (clusters) or cylinder groups
    • Accessing file data
      – Depends on approach: linked or indexed
Allocation of Disk Space (continued)
• The DSM has one entry for each disk block – Entry indicates if block is free or allocated to a file – Information can be maintained in a single bit • DSM also called a bit map
• DSM is consulted every time a new disk block has to be allocated to a file
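A disk status map kept as a bit map can be sketched in Python as below. The class name `DiskStatusMap` and the first-fit search for a free block are assumptions made for the sketch.

```python
class DiskStatusMap:
    """One bit per disk block: 0 = free, 1 = allocated to a file."""

    def __init__(self, num_blocks: int):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_allocated(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self) -> int:
        """Consult the map and allocate the first free block found."""
        for block in range(self.num_blocks):
            if not self.is_allocated(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise RuntimeError("no free disk block")

    def free(self, block: int) -> None:
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

dsm = DiskStatusMap(16)
allocated = [dsm.allocate() for _ in range(3)]
dsm.free(1)
reused = dsm.allocate()
```

A map for 16 blocks needs only 2 bytes, and a freed block is found again by the next allocation.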
Linked Allocation
• Each disk block has data, address of next disk block – Simple to implement – Low allocation/deallocation overhead
• Supports sequential files quite efficiently
• Files with nonsequential organization cannot be accessed efficiently
• Reliability is poor: corruption of the pointer (metadata) in one disk block can make the rest of the file inaccessible
Linked Allocation (continued)
• MS-DOS uses a variant of linked allocation that stores the metadata separately from the file data, in a file allocation table (FAT)
• The FAT has one element corresponding to every disk block in the disk
  – Penalty: the FAT has to be accessed to obtain the address of the next disk block
    • Solution: the FAT is held in memory during file processing
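Following the chain of a file through an in-memory FAT can be sketched in Python. The table contents, the end-of-file marker `EOF = -1`, and the use of `None` for free blocks are assumptions made for the illustration, not the actual MS-DOS encoding.

```python
EOF = -1    # assumed marker for the last block of a file

# Hypothetical FAT for a disk of 8 blocks: element i holds the address of the
# block that follows block i in its file (None marks a free block).
fat = [None, 5, None, 6, None, 3, EOF, None]

def file_blocks(fat, first_block: int) -> list:
    """Follow the chain in the FAT, starting at the file's first block."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]        # one FAT access per block of the file
    return blocks

alpha_blocks = file_blocks(fat, first_block=1)
```

The chain 1, 5, 3, 6 is recovered without reading any data block, since the pointers live in the FAT rather than in the blocks themselves.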
Indexed Allocation
• An index, called the file map table (FMT), is maintained to note the addresses of disk blocks allocated to a file
  – Simplest form: FMT can be an array of disk block addresses
Indexed Allocation (continued)
• Other variations: – Two-level FMT organization: compact, but access to data blocks is slower – Hybrid FMT organization: small files of n or fewer data blocks continue to be accessible efficiently
Performance Issues
• Issues related to use of disk block as allocation unit – Size of the metadata – Efficiency of accessing file data
• Both addressed using a larger unit of allocation – Use the extent as a unit of disk space allocation • Extent: set of consecutive disk blocks • Large extents provide better access efficiency – Problem: more internal fragmentation – Solution: variable extent sizes » Size is indicated in metadata
Interface Between File System and IOCS
• Interface between file system and IOCS consists of – File map table (FMT) – Open files table (OFT) – File control block (FCB)
Interface Between File System and IOCS (continued)
• The file system allocates disk space to a file and stores information about the allocated disk space in the file map table (FMT) – The FMT is typically held in memory during the processing of a file
• A file control block (FCB) contains all information concerning an ongoing file processing activity
• The open files table (OFT) holds the FCBs of all open files
  – The OFT resides in the kernel address space so that user processes cannot tamper with it
Interface Between File System and IOCS (continued)
When alpha is opened:
• File system copies FMT_alpha into memory
• Creates fcb_alpha in the OFT
• Initializes fields appropriately
• Passes offset in OFT to process, as internal_id_alpha
Interface Between File System and IOCS (continued)
Steps in file processing involving the file system and the IOCS
1. The process executes the call open (alpha, ‘read’, <file_attributes>). The call returns with internal_id_alpha.
2. The file system creates a new FCB, fcb_alpha, in the open files table. The file system now makes a call iocs-open with internal_id_alpha and the address of the directory entry of alpha as parameters.
3. The IOCS accesses the directory entry of alpha, and copies the file size and address of the FMT, or the FMT itself, from the directory entry into fcb_alpha.
4. When the process wishes to read a record of alpha into area xyz, it invokes the read operation of the file system with internal_id_alpha, <record_info>, and Ad(xyz) as parameters.
5. Information about the location of alpha is now available in fcb_alpha. Hence the read/write operations merely invoke iocs-read/write operations.
6. The process invokes the close operation with internal_id_alpha as a parameter.
7. The file system makes a call iocs-close with internal_id_alpha.
8. The IOCS obtains information about the directory entry of alpha from fcb_alpha and copies the file size and FMT address, or the FMT itself, from fcb_alpha into the directory entry of alpha.
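The open, read, and close steps can be sketched as a toy Python model. Several simplifications are assumed: the directory is a dictionary, the disk maps a block address to one record, the internal id is the index of the FCB in the OFT, and only sequential reads are modeled. All class and method names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DirectoryEntry:
    name: str
    size: int
    fmt: list                     # file map table: addresses of the file's disk blocks

@dataclass
class FCB:
    dir_entry: DirectoryEntry     # lets close find the directory entry again
    size: int = 0
    fmt: list = field(default_factory=list)
    next_record: int = 0          # position of the next record to be processed

class FileSystem:
    def __init__(self, directory: dict, disk: dict):
        self.directory = directory    # file name -> DirectoryEntry
        self.disk = disk              # block address -> record (one record per block)
        self.oft = []                 # open files table

    def open(self, name: str) -> int:
        entry = self.directory[name]
        self.oft.append(FCB(dir_entry=entry))     # step 2: create the FCB in the OFT
        internal_id = len(self.oft) - 1
        self._iocs_open(internal_id, entry)
        return internal_id                        # step 1: open returns the internal id

    def _iocs_open(self, internal_id: int, entry: DirectoryEntry) -> None:
        fcb = self.oft[internal_id]               # step 3: copy size and FMT into the FCB
        fcb.size, fcb.fmt = entry.size, list(entry.fmt)

    def read(self, internal_id: int):
        fcb = self.oft[internal_id]               # steps 4-5: the FCB has the location info
        if fcb.next_record >= len(fcb.fmt):
            return None                           # end of file
        record = self.disk[fcb.fmt[fcb.next_record]]
        fcb.next_record += 1
        return record

    def close(self, internal_id: int) -> None:
        fcb = self.oft[internal_id]               # steps 6-8: copy size and FMT back
        fcb.dir_entry.size, fcb.dir_entry.fmt = fcb.size, list(fcb.fmt)
        self.oft[internal_id] = None              # erase the FCB from the OFT

disk = {7: "first record", 2: "second record"}
fs = FileSystem({"alpha": DirectoryEntry("alpha", size=2, fmt=[7, 2])}, disk)
internal_id_alpha = fs.open("alpha")
records_read = [fs.read(internal_id_alpha), fs.read(internal_id_alpha)]
at_end = fs.read(internal_id_alpha)
fs.close(internal_id_alpha)
```

After open, the process never names the file again. Every later call carries only the internal id, and the FCB supplies everything else.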
File Processing (Implementing File Access)
• File System Actions at open – Sets up the arrangement involving FCB and OFT
• File System Actions during a File Operation – Performs disk space allocation if necessary
• File System Actions at close – Updates directories if necessary
File System Actions at open
• The purpose of a call open (<path_name>, <processing_mode>, <file_attributes>), where <path_name> is an absolute or relative path name for a file <file_name>, is to set up the processing of the file
• open performs the following actions:
  1. It aborts the process if <processing_mode> is not consistent with the protection information for the file. Otherwise, it creates an FCB for the file <file_name> in the OFT, and puts relevant information in its fields. If <file_name> is a new file, it also writes <file_attributes> into its directory entry.
  2. It passes the internal id of the file <file_name> back to the process for use in file processing actions.
  3. If the file <file_name> is being created or appended to, it makes provision to update the file’s directory entry when a close call is made by the process.
File System Actions at open (continued)
• Perform path name resolution
  – For each component in the path name, locate the correct directory or file
  – Handle path names passing through mount points
    • A file should be allocated disk space in its own file system
  – Build FCB for the file
• Retain sufficient information to perform a close operation on the file
  – Close may have to update the file’s entry in the parent directory
  – It may cause changes in the parent directory’s entry in ancestor directories
File System Actions during a File Operation
• Each file operation is translated into a call:
  – <opn> (internal_id, record_id, <IO_area addr>);
    • internal_id is the internal id of <file_name> returned by the open call
    • record_id is absent for sequential-access files
      – Operation is performed on the next record
• The file system performs the following actions to process this call:
  1. Locate the FCB of <file_name> in the OFT using internal_id.
  2. Search the access control list of <file_name> for the pair (U, ...). Give an error if the protection information found in the file’s FCB does not permit user U to perform <opn> on the file.
  3. Make a call on iocs-read or iocs-write with the parameters internal_id, record_id and <IO_area addr>. For nonsequential-access files, the operation is performed on the indicated record. For sequential-access files, the operation is performed on the record whose address is in the FCB field “address of the next record to be processed,” and the contents of this field are updated to point to the next record in the file.
File System Actions at close
• The file system performs the following actions when a process executes the statement close (internal_id, ...):
  1. If the file has been newly created or appended to:
     a. If it is a newly created file, create an entry for the file in the directory pointed to by the directory FCB pointer. If the directory entry format contains a field where the complete FMT can be stored, copy the FMT into this field; otherwise, first write the FMT into a disk block and copy the address of this disk block into the directory entry.
     b. If the file has been appended to, the directory entry of the file is updated by using directory FCB pointer.
     c. If necessary, repeat Steps 1b and 1c to update other directories in the path name of the file after setting file FCB pointer := directory FCB pointer and directory FCB pointer := address of parent directory’s FCB found in the FCB of the file. If their FCBs were deleted after open, the directory files would have to be opened and updated.
  2. The FCB of the file and FCBs of its parent and ancestor directories are erased from the OFT.
Case Study – Unix file system
• File system data structures
  – A directory entry contains only the file name and the inode number of the file
  – The inode of a file contains file size, owner id, access permissions and disk block allocation information
  – A file structure contains information about an open file
    • It contains current position in file, and pointer to its inode
  – A file descriptor points to a file structure
  – Indexed disk space allocation uses 3 levels of indirection
• Unix file sharing semantics
  – Result of a write performed by a process is immediately visible to all other processes currently accessing the file
Unix File System
• Disk Space Allocation
  – Unix uses indexed disk space allocation, with a disk block size of 4 KB
  – Each file has a file allocation table analogous to an FMT, which is maintained in its inode
  – The allocation table contains 15 entries
  – Twelve of these entries directly point to data blocks of the file
  – The next entry in the allocation table points to an indirect block, i.e., a block that itself contains pointers to data blocks
  – The next two entries point to double and triple indirect blocks, respectively
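The reach of each level of indirection can be worked out with a short Python sketch. It assumes 4-byte disk block addresses, which the slides do not state; with 4 KB blocks this gives 1024 pointers per indirect block.

```python
BLOCK_SIZE = 4096                 # disk block size of 4 KB
POINTER_SIZE = 4                  # ASSUMPTION: 4-byte disk block addresses
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE
DIRECT_ENTRIES = 12

def indirection_level(block_no: int) -> int:
    """Return how many indirect blocks are traversed to reach a file block (0 to 3)."""
    if block_no < DIRECT_ENTRIES:
        return 0                                  # reached through a direct entry
    block_no -= DIRECT_ENTRIES
    if block_no < PTRS_PER_BLOCK:
        return 1                                  # single indirect block
    block_no -= PTRS_PER_BLOCK
    if block_no < PTRS_PER_BLOCK ** 2:
        return 2                                  # double indirect block
    block_no -= PTRS_PER_BLOCK ** 2
    if block_no < PTRS_PER_BLOCK ** 3:
        return 3                                  # triple indirect block
    raise ValueError("block number exceeds the maximum file size")

# Largest number of data blocks a file can have under these assumptions.
max_blocks = (DIRECT_ENTRIES + PTRS_PER_BLOCK
              + PTRS_PER_BLOCK ** 2 + PTRS_PER_BLOCK ** 3)
```

Under these assumptions the first 12 blocks (48 KB) of a file need no indirect block, which keeps access to small files fast.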
Berkeley Fast File System
• FFS was developed to address the limitations of the file system s5fs
• Supports some enhancements like long file names and use of symbolic links
• Includes several innovations concerning disk block allocation and disk access:
  – Permits use of large disk blocks (up to 8 KB)
  – Uses cylinder groups to reduce disk head movement
  – Tries to minimize rotational latency when reading sequential files