How NTFS Works
This is only a brief reference pieced together after many long hours studying the real documentation.
Journalling
NTFS is a journalling file system. It breaks file system operations, such as creating a file, writing data to a file, or deleting a file, into "transactions." None of these operations is atomic, so any of them can fail partway through if the disk loses power. Each transaction is logged before it is carried out; if the disk loses power in the middle of a transaction, NTFS can either roll the transaction back or complete it. This preserves the integrity of the file system.
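The log-then-apply idea can be illustrated with a toy model. This is only a sketch: ToyJournal and recover are hypothetical names, the log lives in memory, and nothing here reflects the actual on-disk format NTFS uses. It also assumes that no two in-flight transactions touch the same key.

```python
class ToyJournal:
    """In-memory stand-in for a write-ahead log (hypothetical, not the NTFS format)."""

    def __init__(self):
        self.log = []           # entries: (txn_id, key, old_value, new_value)
        self.committed = set()  # ids of transactions whose log entries are complete

    def record(self, txn_id, key, old_value, new_value):
        # Log the intended change BEFORE it is applied to the real state.
        self.log.append((txn_id, key, old_value, new_value))

    def commit(self, txn_id):
        self.committed.add(txn_id)


def recover(journal, state):
    """After a crash: complete committed transactions, roll back the rest."""
    # Redo pass: reapply every change that belongs to a committed transaction.
    for txn_id, key, _old, new in journal.log:
        if txn_id in journal.committed:
            state[key] = new
    # Undo pass: walk backwards and restore the old value of uncommitted changes.
    for txn_id, key, old, _new in reversed(journal.log):
        if txn_id not in journal.committed:
            if old is None:
                state.pop(key, None)  # the key did not exist before the transaction
            else:
                state[key] = old
    return state
```

In this model a crash is simulated by calling recover on whatever state happened to reach the disk; committed changes reappear and half-finished ones disappear.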
Master File Table
NTFS stores file descriptors in a Master File Table (MFT). Each descriptor is called an "inode" and contains metadata for the file such as filename, creation date, and location on disk. The first 16 entries of the table store filesystem metadata such as the root directory entry, allocated clusters, and quota information. The table itself is a file called $MFT. The boot sector of the partition (sector 0 in that partition) contains a reference to $MFT and its sibling $MFTMirr. The MFT Mirror, $MFTMirr, contains copies of the first 4 entries in the MFT: $MFT, $MFTMirr, $LogFile, and $Volume. If any of these four entries in $MFT becomes corrupt, it can be restored from its backup in $MFTMirr. Unfortunately, the other inodes have no backups.
The size of each inode is typically 1 KB. The file $Boot (the boot sector, sector 0) records the actual size. Sometimes the metadata becomes large enough that it will not fit inside a single inode, and NTFS allocates multiple inodes to the file. In this case, the extension inodes have a link to the original inode.
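As a rough illustration, the boot sector fields mentioned above can be read as follows. This is a sketch, assuming the commonly documented boot sector layout (bytes per sector at offset 0x0B, sectors per cluster at 0x0D, the $MFT and $MFTMirr cluster numbers at 0x30 and 0x38, and the inode size at 0x40). The function name and the dictionary keys are my own. The inode size byte is signed: a negative value n means 2^(-n) bytes, and a positive value is a count of clusters.

```python
import struct


def parse_ntfs_boot_sector(sector: bytes) -> dict:
    """Extract cluster size, inode size, and MFT location from sector 0 (sketch)."""
    if len(sector) < 512:
        raise ValueError("need at least 512 bytes")
    if sector[3:11] != b"NTFS    ":
        raise ValueError("OEM ID is not NTFS")

    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 0x0B)
    mft_lcn, mftmirr_lcn = struct.unpack_from("<QQ", sector, 0x30)
    (raw_record_size,) = struct.unpack_from("<b", sector, 0x40)

    cluster_size = bytes_per_sector * sectors_per_cluster
    if raw_record_size < 0:
        record_size = 1 << -raw_record_size           # e.g. -10 -> 1024 bytes
    else:
        record_size = raw_record_size * cluster_size  # size given in clusters

    return {
        "cluster_size": cluster_size,
        "record_size": record_size,
        "mft_offset": mft_lcn * cluster_size,          # byte offset within the partition
        "mftmirr_offset": mftmirr_lcn * cluster_size,
    }
```

The sketch ignores the alternative encoding that newer volumes use for very large clusters, so treat it as valid only for the cluster sizes discussed in this document.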
File Attributes
Each inode stores file attributes, or file metadata. The more interesting ones are the filename attribute, data attribute, index root attribute, and index allocation attribute. The filename attribute is straightforward; it stores the name of a file. The data attribute encapsulates the user-visible file data. Indexes will be discussed later. Attributes can be resident or non-resident. Resident attributes are contained within the inode. Non-resident attributes are listed in the inode, but the actual data is stored elsewhere on disk. For a non-resident attribute, the inode contains "data runs" that describe the location(s) and length(s) of the clusters holding the data.
Because file data is itself stored as an attribute, a small file (< 1 KB) can potentially be stored entirely inside the inode as a resident data attribute. Larger files will use a non-resident data attribute.
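The resident/non-resident distinction shows up directly in the attribute header. The following is a minimal sketch, assuming the commonly documented header layout: type at offset 0, total length at 4, a non-resident flag at 8, then either a value length and value offset (resident) or a VCN range and a data run offset (non-resident). The function name and returned keys are hypothetical.

```python
import struct

# Commonly documented attribute type codes.
ATTR_FILE_NAME = 0x30
ATTR_DATA = 0x80
ATTR_INDEX_ROOT = 0x90
ATTR_INDEX_ALLOCATION = 0xA0


def parse_attribute_header(buf: bytes, offset: int = 0) -> dict:
    """Decode one attribute header starting at `offset` inside an inode (sketch)."""
    attr_type, length, non_resident = struct.unpack_from("<IIB", buf, offset)
    info = {"type": attr_type, "length": length, "resident": non_resident == 0}

    if non_resident == 0:
        # Resident: the value lives inside the inode, right after the header.
        value_length, value_offset = struct.unpack_from("<IH", buf, offset + 16)
        start = offset + value_offset
        info["value"] = buf[start:start + value_length]
    else:
        # Non-resident: the inode only holds data runs pointing at clusters.
        first_vcn, last_vcn, runs_offset = struct.unpack_from("<QQH", buf, offset + 16)
        info["first_vcn"] = first_vcn
        info["last_vcn"] = last_vcn
        info["data_runs"] = buf[offset + runs_offset:offset + length]
    return info
```

For a resident data attribute, info["value"] is the file's contents; for a non-resident one, info["data_runs"] still has to be decoded to find the clusters.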
Data Runs and Clusters
The disk is divided into clusters. NTFS supports cluster sizes from 512 bytes (1 sector) to 64 KB (128 sectors). A typical cluster size is 4 KB. The cluster size can be found from $Boot. A data run is a list of offsets and lengths that describe the fragments of an attribute. An unfragmented attribute will usually have one entry.
NTFS supports "sparse" files, files that have a lot of unused space. This is common in databases. When NTFS detects clusters within a file composed entirely of zeros, it does not store them. These "sparse" fragments are recorded in the data runs and handled automatically by the filesystem.
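A data run list can be decoded with a short loop. This sketch assumes the commonly documented encoding: each run starts with a header byte whose low nibble gives the size of the length field and whose high nibble gives the size of the offset field; the offset is signed and relative to the start of the previous run; an offset size of zero marks a sparse run; a zero header byte ends the list. The function name is my own, and sparse runs are reported with None in place of a cluster number.

```python
def decode_data_runs(raw: bytes):
    """Return a list of (starting_cluster_or_None, length_in_clusters) tuples."""
    runs = []
    pos = 0
    lcn = 0  # absolute cluster number of the most recent non-sparse run
    while pos < len(raw) and raw[pos] != 0:
        header = raw[pos]
        length_size = header & 0x0F   # low nibble
        offset_size = header >> 4     # high nibble
        pos += 1

        length = int.from_bytes(raw[pos:pos + length_size], "little")
        pos += length_size

        if offset_size == 0:
            # Sparse run: the clusters are all zeros and occupy no disk space.
            runs.append((None, length))
        else:
            delta = int.from_bytes(raw[pos:pos + offset_size], "little", signed=True)
            pos += offset_size
            lcn += delta
            runs.append((lcn, length))
    return runs
```

For example, the bytes 21 18 34 56 00 decode to a single run of 0x18 clusters starting at cluster 0x5634, which is what an unfragmented attribute looks like.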
Directories and Indexes
Directories have traditionally been a special form of file. In NTFS, directories are a special form of an index. An index caches attributes from an arbitrary collection of files. In the case of a directory, this collection represents the filename attributes from all the files within the directory.
An index is a tree such that an in-order traversal yields a sorted list. The index root attribute contains the head node in the tree. The index allocation attribute is optional and contains additional nodes within the tree. (Note: if the tree has depth > 1, an index allocation attribute is required.)
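The ordering property can be shown with a toy tree. This is an illustrative sketch only: IndexNode is a hypothetical in-memory structure, not the on-disk index format, and it compares names with plain string ordering. The root plays the role of the index root attribute, and the child nodes play the role of nodes held in the index allocation attribute.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class IndexNode:
    keys: List[str]                                             # sorted filenames in this node
    children: List["IndexNode"] = field(default_factory=list)   # empty for a leaf,
                                                                 # else len(keys) + 1 subtrees


def in_order(node: IndexNode) -> Iterator[str]:
    """Yield every key in the tree in sorted order."""
    if not node.children:
        yield from node.keys
        return
    # Each key sits between the subtree to its left and the subtree to its right.
    for child, key in zip(node.children, node.keys):
        yield from in_order(child)
        yield key
    yield from in_order(node.children[-1])


root = IndexNode(
    keys=["m.txt"],
    children=[IndexNode(["a.txt", "c.txt"]), IndexNode(["x.txt", "z.txt"])],
)
listing = list(in_order(root))
```

Here listing holds the five names in sorted order, which is how a directory listing falls out of the index without a separate sort.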
Putting it Together
When NTFS mounts a partition, it first needs to locate the MFT. If the $MFT link in the boot sector is corrupt, it will attempt to use $MFTMirr. The first entry in $MFTMirr is the inode for $MFT. Next, it obtains the data runs for the MFT. The MFT need not be contiguous, though ideally it is.
Next, NTFS needs to parse the file tree. Inode 5 (the sixth entry, since MFT entries are numbered from 0) is the root directory of the partition. From it, NTFS can obtain links to the subdirectories and files in the root directory. NTFS is now able to recurse through the file tree by reading the directories.
Before it can write to the disk, NTFS needs to know which clusters on the disk are allocated and which are not. The metadata file $Bitmap, inode 6 (numbering from 0), contains a bit array representing the clusters on the volume. If a bit is high (1), then it corresponds to an allocated cluster. If a bit is low (0), then it corresponds to an unallocated cluster. NTFS can now allocate clusters to file attributes as needed.
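Looking up a cluster in the bit array is a matter of simple arithmetic. This sketch assumes that the least significant bit of each byte describes the lowest-numbered cluster in that byte; both function names are hypothetical.

```python
from typing import Optional


def is_cluster_allocated(bitmap: bytes, cluster: int) -> bool:
    """Return True if the bit for `cluster` is high (1)."""
    byte_index, bit_index = divmod(cluster, 8)
    return bool((bitmap[byte_index] >> bit_index) & 1)


def find_free_cluster(bitmap: bytes, total_clusters: int) -> Optional[int]:
    """Return the first unallocated cluster, or None if the volume is full."""
    for cluster in range(total_clusters):
        if not is_cluster_allocated(bitmap, cluster):
            return cluster
    return None
```

A linear scan like this is the simplest possible allocation policy; it is shown only to make the bit layout concrete, not as a description of how any real driver chooses clusters.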
To create a file, NTFS examines the MFT. Each inode contains an allocation flag describing its status. If unallocated, NTFS can use the inode for a new file. If the table is full, NTFS can expand it. To prevent the MFT from becoming fragmented, the Windows NTFS driver will maintain a buffer zone around the MFT. It will not allocate from this zone until the remainder of the disk is full. As the free space diminishes, it relaxes this zone.
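Scanning the table for a reusable inode might look like the following. This is a sketch under stated assumptions: each inode starts with the signature FILE, carries a 16-bit flags field at offset 0x16 in which bit 0x0001 means "in use" and bit 0x0002 means "directory", and is 1 KB long by default. The function name is my own, and the sketch simply skips any record that lacks the signature.

```python
import struct
from typing import Optional

FLAG_IN_USE = 0x0001
FLAG_DIRECTORY = 0x0002


def find_free_inode(mft: bytes, record_size: int = 1024) -> Optional[int]:
    """Return the index of the first inode whose in-use flag is clear, else None."""
    for index in range(len(mft) // record_size):
        base = index * record_size
        if mft[base:base + 4] != b"FILE":
            continue  # not a valid record; this sketch does not try to reuse it
        (flags,) = struct.unpack_from("<H", mft, base + 0x16)
        if not flags & FLAG_IN_USE:
            return index
    return None  # table is full; the caller would have to expand $MFT
```

A None result corresponds to the "table is full" case above, where the MFT has to grow before the new file can be created.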