Databases can be large and cannot be assumed to fit in memory; instead they must be stored in persistent, non-volatile storage. The DBMS must move data from non-volatile storage to volatile storage and vice versa.

There is a storage hierarchy:

| Type            | Latency (ns)  | Class  |
|-----------------|---------------|--------|
| CPU Registers   | 1             | CPU    |
| CPU Caches      | 4             | CPU    |
| DRAM            | 100           | Memory |
| SSD             | 16,000        | Disk   |
| HDD             | 2,000,000     | Disk   |
| Network Storage | 50,000,000    | Disk   |
| Tape            | 1,000,000,000 |        |
Media higher in the hierarchy are faster, smaller (hold less data), and more expensive.
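To get a feel for the gaps in the table, a quick sketch (using the table's rough figures) can normalize each medium's latency to a DRAM access:

```python
# Latencies from the table above, in nanoseconds (rough figures).
LATENCY_NS = {
    "CPU Registers": 1,
    "CPU Caches": 4,
    "DRAM": 100,
    "SSD": 16_000,
    "HDD": 2_000_000,
    "Network Storage": 50_000_000,
    "Tape": 1_000_000_000,
}

def relative_to_dram(medium: str) -> float:
    """How many DRAM accesses fit in one access to `medium`."""
    return LATENCY_NS[medium] / LATENCY_NS["DRAM"]

print(relative_to_dram("SSD"))  # 160.0
print(relative_to_dram("HDD"))  # 20000.0
```

One SSD read costs on the order of a hundred DRAM accesses; one HDD read costs tens of thousands, which is why the DBMS works so hard to avoid disk I/O.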

DRAM, CPU caches, and CPU registers are volatile, meaning their contents are lost if something goes wrong, like a power outage.

Every change made in volatile memory must eventually be written back to non-volatile storage.

The storage manager is responsible for managing files.
A file is a collection of disk pages. Unless otherwise mentioned, a page is the granularity of reads and writes to disk (hardware pages are usually 4 KB; database pages are usually 512 B-32 KB).
Files must support:

  • Insert
  • Update
  • Delete
  • Read
  • Scan

Heap files have unordered collections of pages. If you have a single file and you know the location of your data, finding a particular page is easy. When you have multiple files, a directory (which is a special page or collection of pages) stores the addresses of pages as well as metadata.
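A toy sketch of such a directory (the layout and field names here are illustrative, not any real DBMS's format): it maps page ids to their location and tracks per-page metadata such as free space, so inserts can find a page with room.

```python
# Hypothetical heap-file directory: page id -> location + free-space metadata.
directory = {
    0: {"offset": 0,    "free_bytes": 120},
    1: {"offset": 4096, "free_bytes": 0},
    2: {"offset": 8192, "free_bytes": 4096},
}

def find_page_with_space(needed: int):
    """Return the id of the first page with enough free space, else None."""
    for page_id, meta in directory.items():
        if meta["free_bytes"] >= needed:
            return page_id
    return None

print(find_page_with_space(100))   # 0
print(find_page_with_space(2000))  # 2
```

Without the directory, an insert would have to probe pages one by one to find free space.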

Sorted files have ordered data according to some attribute.

To understand the differences, let's look at average cases, ignoring CPU costs and further optimizations. Say the number of pages is B:

| Operation        | Heap file (B pages) | Sorted file (B pages)   | Remark                                       |
|------------------|---------------------|-------------------------|----------------------------------------------|
| Scan all records | B                   | B                       | every page is read either way                |
| Equality search  | 0.5 B               | log₂ B                  | assumes 0 or 1 match; binary search on pages |
| Range search     | B                   | log₂ B + matching pages |                                              |
| Insert           | 2                   | log₂ B + B              | potentially need to shift everything over    |
| Delete           | 0.5 B + 1           | log₂ B + B              | find the record, then write back (or shift)  |
| Update           | 0.5 B + 1           | log₂ B + B              | a delete plus an insert in the worst case    |
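The equality-search numbers can be sanity-checked with a back-of-the-envelope model (a sketch of the cost formulas, not a real DBMS measurement):

```python
import math

# Average-case page I/Os for an equality search over a file of B pages,
# assuming 0 or 1 match: heap files scan half the file on average,
# sorted files binary-search over pages.
def heap_equality(B: int) -> float:
    return B / 2

def sorted_equality(B: int) -> float:
    return math.log2(B)

B = 1024
print(heap_equality(B))    # 512.0
print(sorted_equality(B))  # 10.0
```

For a 1,024-page file, keeping the data sorted turns ~512 page reads into ~10.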
So, what’s in a page?
A page header contains metadata about the contents of the page
  • Page size
  • Checksum
  • Transaction history
  • Compression metadata
  • Schema information
  • Data sketches
In some systems, pages are self-contained.

Data in a page can be stored in many ways,

  1. Tuple-oriented
  2. Column-oriented
  3. Partition Attributes Across (PAX)

If the tuple size is fixed, then finding the i-th attribute is very easy, as is storing tuples. However, deleting a tuple is difficult and requires either moving tuples or finding a way to track empty slots. A bitmap of n bits can be used to keep track of which of a page's n slots are occupied.
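A minimal sketch of such a slot bitmap, one bit per slot (names and the 8-slot page size are illustrative):

```python
# Free-slot bitmap for a page holding N_SLOTS fixed-size tuples.
# Bit i = 1 means slot i is occupied, 0 means it is free.
N_SLOTS = 8

def find_free_slot(bitmap: int) -> int:
    """Return the index of the first free slot, or -1 if the page is full."""
    for i in range(N_SLOTS):
        if not (bitmap >> i) & 1:
            return i
    return -1

def set_slot(bitmap: int, i: int) -> int:
    """Mark slot i as occupied."""
    return bitmap | (1 << i)

bitmap = 0b00000111              # slots 0-2 occupied
free = find_free_slot(bitmap)    # 3
bitmap = set_slot(bitmap, free)  # now 0b00001111
```

Deleting a tuple is then just clearing its bit; the slot can be reused by a later insert without moving any other tuples.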

If the tuple size is variable, then things get difficult. Fields can be delimited by special symbols, although this creates another overhead for finding attributes. Some systems store all the offsets in the header, which requires more storage but is quite efficient (and works nicely for NULL/empty values). Now, the question is how to store variable-sized tuples. The trick is to have a "slotted" page:

  • Pointers are stored at the start
  • When a tuple is added, it's put in the next empty slot, with tuple data growing from the end of the page
  • When a tuple is removed, the space (and where the pointer was stored) stays empty
  • A garbage collecting routine can compact the page
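The points above can be sketched as a toy slotted page (an illustration of the idea, not a real on-disk layout; field names and the 4 KB size are assumptions):

```python
# Toy slotted page: the slot array grows from the front of the page,
# tuple data grows backward from the end. Deletes leave holes for a
# compaction pass to reclaim later.
class SlottedPage:
    def __init__(self, size: int = 4096):
        self.size = size
        self.slots = []          # slot id -> (offset, length), or None if deleted
        self.free_end = size     # data area grows downward from here
        self.data = bytearray(size)

    def insert(self, record: bytes) -> int:
        """Place a record at the end of free space; return its slot id."""
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1

    def delete(self, slot_id: int) -> None:
        """Clear the slot; the record's bytes stay as garbage until compaction."""
        self.slots[slot_id] = None

    def get(self, slot_id: int):
        entry = self.slots[slot_id]
        if entry is None:
            return None
        off, length = entry
        return bytes(self.data[off:off + length])
```

Because callers hold slot ids rather than byte offsets, a compaction routine can move tuples within the page and only update the slot array.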

Limitations of tuple-oriented storage:

  • Fragmentation (until GC or similar compacts the tuples)
  • Superfluous disk I/O
  • Lots of random I/O since columns are not stored together

HDD

Random access involves mechanical movement; however, disks provide cheap, non-volatile storage.

  1. Move arm to track (seek latency)
  2. Rotate the platters to the correct sector (rotational latency)
  3. Read/write pages under the head (transfer latency)
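The three steps above add up to a per-access cost that can be estimated with a back-of-the-envelope calculation (the drive parameters here are illustrative assumptions: 4 ms average seek, 7200 RPM, 200 MB/s transfer):

```python
# Estimated HDD access time for one 4 KB page, in milliseconds.
avg_seek_ms = 4.0                        # step 1: move arm to track
rotational_ms = 0.5 * (60_000 / 7200)    # step 2: half a revolution on average
transfer_ms = (4 / 200_000) * 1000       # step 3: 4 KB at 200 MB/s

total_ms = avg_seek_ms + rotational_ms + transfer_ms
print(round(total_ms, 2))  # 8.19
```

The transfer itself is negligible; nearly all the cost is mechanical (seek plus rotation), which is why sequential I/O is so much cheaper than random I/O on an HDD.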

SSD

Operates electronically (no moving parts). Updates are written out of place to new locations, and a garbage-collection process later reclaims the stale copies.