May 29th Lecture
=================

File:
 * flat array of bytes
 * grouped into directories; directories can hold other directories also

Hard Drive:
 * divided into fixed-size blocks; 4k, 8k, and 16k are common sizes on
   modern FS's
 * reading/writing sequential data is reasonably fast
 * seeking is slow (spin disk, move heads)
 * internally a drive may be very complex (multiple platters, moving heads,
   etc.), but the interface it provides to the OS is simply a large array of
   blocks that can be read/written
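
A minimal sketch of the "large array of blocks" interface, assuming a 4k block
size and using an in-memory byte stream as a stand-in for the drive. The names
read_block, write_block, and BLOCK_SIZE are made up for illustration:

```python
import io

BLOCK_SIZE = 4096  # assumed block size; 4k is one of the common sizes


def read_block(dev, block_no, block_size=BLOCK_SIZE):
    """Return the contents of block number block_no from a seekable stream."""
    dev.seek(block_no * block_size)  # block number -> byte offset
    return dev.read(block_size)


def write_block(dev, block_no, data, block_size=BLOCK_SIZE):
    """Overwrite one whole block; data must be exactly one block long."""
    if len(data) != block_size:
        raise ValueError("data must be exactly one block")
    dev.seek(block_no * block_size)
    dev.write(data)


# An in-memory stand-in for a tiny 8-block drive, initially all zeros.
disk = io.BytesIO(bytes(8 * BLOCK_SIZE))
write_block(disk, 3, b"A" * BLOCK_SIZE)
```

On a real drive the seek is the slow part; here it is only offset arithmetic.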

Storage devices can be partitioned; separate FS placed on each
 * multiple mount points (/home, /tmp, /usr/local)
 * multiple OS's (Linux, FreeBSD, Windows)

Filesystem:
 * superblock -- fixed location, identifies FS, stores important information
   about how to find data (block size, etc.)
 * inodes (index nodes) -- file metadata and pointers to data blocks
    - size
    - owner
    - permissions
    - timestamps
    - link count
    - *not filename*
 * data blocks
 * possibly other FS-specific structures (e.g., block allocation bitmap,
   journal, etc.)
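
The inode fields listed above can be inspected from user space. A small sketch
using Python's os.lstat on a POSIX system; inode_summary is a hypothetical
helper name, and the temporary file exists only for the demonstration:

```python
import os
import stat
import tempfile


def inode_summary(path):
    """Collect the inode fields listed above (without following symlinks)."""
    st = os.lstat(path)
    return {
        "inode": st.st_ino,
        "size": st.st_size,
        "owner_uid": st.st_uid,
        "permissions": stat.filemode(st.st_mode),
        "mtime": st.st_mtime,
        "link_count": st.st_nlink,
    }


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "example.txt")
    with open(path, "wb") as f:
        f.write(b"hello")
    info = inode_summary(path)
```

Note that nothing returned by the stat call contains the filename; the name
lives in the directory, not in the inode.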

ln src dest
 * Creates dest as a hard link to src.  Both are now true names for the same
   file; neither one is special
 * file remains until last link is removed
 * any changes (including metadata) are reflected across all names
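
A short sketch of the hard link behavior above, assuming a POSIX filesystem
that supports hard links. os.link does the same job as "ln src dest"; the
file names are placeholders inside a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "src")
    dest = os.path.join(d, "dest")
    with open(src, "w") as f:
        f.write("data")

    os.link(src, dest)  # same effect as: ln src dest
    same_inode = os.stat(src).st_ino == os.stat(dest).st_ino
    links_after_ln = os.stat(src).st_nlink  # both names count as links

    os.remove(src)  # drop one name...
    with open(dest) as f:
        survived = f.read()  # ...the file is still reachable via the other
    links_after_rm = os.stat(dest).st_nlink
```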

ln -s src dest
 * Creates dest as a symbolic link to src (occasionally called a "soft link")
 * dest gets its own inode, whose data is src's path name
 * if src is removed, dest becomes a dangling link
 * permissions on dest are ignored; always appears as "lrwxrwxrwx"
 * can point at directories
 * can span filesystems
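
The same experiment for symbolic links, again as a sketch on a POSIX system.
os.symlink corresponds to "ln -s src dest"; the file names are placeholders:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "src")
    dest = os.path.join(d, "dest")
    with open(src, "w") as f:
        f.write("data")

    os.symlink(src, dest)  # same effect as: ln -s src dest
    stored_target = os.readlink(dest)  # the link stores a path, not an inode
    separate_inode = os.lstat(dest).st_ino != os.stat(src).st_ino

    os.remove(src)
    # The link itself still exists, but following it now fails.
    dangling = os.path.islink(dest) and not os.path.exists(dest)
```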

Data:
 * inode contains an array of block ID's
 * inodes are fixed-size structures; can't hold a huge array; since most
   files are small, we want to optimize for small files
 * first 12 data blocks pointed to directly by inode
 * single indirect block -- one more block of pointers pointed to by inode
 * double indirect and triple indirect blocks -- extra layers of indirection
   for larger files
 * caching makes this cheaper than it sounds...indirect blocks may be cached
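
The lookup scheme above can be sketched as follows. The 12 direct pointers
come from the notes; the 8k block size and 4-byte pointer size are assumed
defaults for illustration, and the function names are made up:

```python
NDIRECT = 12  # direct pointers in the inode, as stated above


def pointers_per_block(block_size, pointer_size):
    """How many block IDs fit in one indirect block."""
    return block_size // pointer_size


def indirection_level(logical_block, block_size=8192, pointer_size=4):
    """Return 0 for a direct block, 1/2/3 for single/double/triple indirect."""
    n = pointers_per_block(block_size, pointer_size)
    if logical_block < NDIRECT:
        return 0
    remaining = logical_block - NDIRECT
    for level in (1, 2, 3):
        span = n ** level  # data blocks reachable through this level
        if remaining < span:
            return level
        remaining -= span
    raise ValueError("block number beyond triple indirect range")


def max_file_blocks(block_size=8192, pointer_size=4):
    """Total data blocks addressable by one inode under these assumptions."""
    n = pointers_per_block(block_size, pointer_size)
    return NDIRECT + n + n ** 2 + n ** 3
```

With these assumed sizes a file up to 96k (12 blocks of 8k) needs no indirect
block at all, which is the small-file optimization the notes describe.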

UFS1 was designed in early 80's, largest HD was only 330 MB
 * at that time 32-bit block ID's wasted space
 * 32 bits now run out around 1-4 TB (depending on block size used); we're
   hitting the limit
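
The arithmetic behind the 1-4 TB figure, as a rough sketch. The unit sizes
(512 bytes and 1k per pointer value) and the signed 31-bit case are
assumptions chosen to show where such a range can come from; they are not
taken from the UFS1 on-disk format:

```python
TIB = 2 ** 40


def max_addressable_bytes(pointer_bits, unit_size):
    """Bytes reachable when each pointer value names one unit of unit_size."""
    return (2 ** pointer_bits) * unit_size


# Unit sizes below are illustrative assumptions, not values taken from UFS1.
limit_512 = max_addressable_bytes(32, 512) / TIB     # 2.0
limit_1k = max_addressable_bytes(32, 1024) / TIB     # 4.0
limit_signed = max_addressable_bytes(31, 512) / TIB  # 1.0 if one bit is the sign
```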

UFS2 -- replacement for UFS1; not backward compatible
 - use 64-bit pointers
 - inodes now 256 bytes (as opposed to 128 bytes for UFS1)
 - added extended attributes: pointers to a block of variable-length,
   generic attributes
        * length
        * name length
        * name
        * value
 - possible uses:
        * ACL's (which previously had been hacked in through auxiliary files)
        * Mandatory Access Control (MAC) and other data labelling
 - other improvements:
        * 64-bit timestamps (avoid Y2038 problem)
        * extra attributes
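
A sketch of how variable-length attribute records with the four fields above
(length, name length, name, value) might be packed and walked. The byte layout
here (4-byte little-endian total length, 1-byte name length) is hypothetical
and is not the actual UFS2 on-disk format; pack_attr and unpack_attrs are
made-up names:

```python
import struct

# Hypothetical header: 4-byte total record length, 1-byte name length.
HEADER = struct.Struct("<IB")


def pack_attr(name, value):
    """Build one record: header, then name bytes, then value bytes."""
    name_bytes = name.encode("utf-8")
    if len(name_bytes) > 255:
        raise ValueError("name too long for a 1-byte name length")
    total = HEADER.size + len(name_bytes) + len(value)
    return HEADER.pack(total, len(name_bytes)) + name_bytes + value


def unpack_attrs(buf):
    """Walk a buffer of back-to-back records and return (name, value) pairs."""
    attrs = []
    offset = 0
    while offset < len(buf):
        total, name_len = HEADER.unpack_from(buf, offset)
        name_start = offset + HEADER.size
        value_start = name_start + name_len
        name = buf[name_start:value_start].decode("utf-8")
        value = buf[value_start:offset + total]
        attrs.append((name, value))
        offset += total  # the length field locates the next record
    return attrs


block = pack_attr("acl", b"user:alice:rw-") + pack_attr("mac.label", b"secret")
```

Storing the total length first lets a reader skip attributes it does not
understand, which is what makes the scheme generic.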

Extra attributes provided by UFS2:
 * immutable
 * append-only
 * no dump

Security States (superuser can raise, only init process can lower):
 * 0 (insecure) -- immutable & append-only can be turned off
 * 1 (secure) -- immutable & append-only can't be turned off, even by
   superuser; direct memory access via /dev/mem and /dev/kmem is now
   read-only
 * 2 (highly secure) -- same as level 1, plus raw disk devices
   (/dev/ad0s1e...or /dev/hda1 to use Linux terminology) are now
   read-only, even if unmounted
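
The rules above can be written down as a toy policy model. This only restates
the three levels from the notes; it is not how any kernel implements them,
and all function names are made up:

```python
def may_clear_flags(level):
    """immutable / append-only can only be turned off at level 0."""
    return level < 1


def may_write_kernel_memory(level):
    """/dev/mem and /dev/kmem become read-only from level 1 up."""
    return level < 1


def may_write_raw_disk(level):
    """Raw disk devices become read-only at level 2."""
    return level < 2


def may_change_level(current, new, is_init):
    """Raising is allowed for the superuser; lowering only for init."""
    return new >= current or is_init
```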