Databases can be large and need not fit in memory. Instead they must be stored in persistent, non-volatile storage. The DBMS must move data from non-volatile storage to volatile storage and vice versa.
There is a storage hierarchy:
| Type | Latency (nS) | Class |
|---|---|---|
| CPU Registers | 1 | CPU |
| CPU Caches | 4 | CPU |
| DRAM | 100 | Memory |
| SSD | 16,000 | Disk |
| HDD | 2,000,000 | Disk |
| Network Storage | 50,000,000 | Disk |
| Tape | 1,000,000,000 | |
| Higher storage medium are faster, smaller (fits less data), and more expensive. |
DRAM, CPU Caches, and CPU Registers are volatile which means they will not persist if something goes wrong like a power outage
Every time volatile memory is changed, the non-volatile memory must be updated
The storage manager is responsible for managing files
A file is a collection of disk pages. Unless otherwise mentioned, a page is the granularity of reads and writes to disk (hardware pages are usually 4KB, database ages are usually 512B-32KB)
Files musts support,
- Insert
- Update
- Delete
- Read
- Scan
Heap files have unordered collections of pages. If you have a single file and you know the location of your data, finding a particular page is easy. When you have multiple files, a directory (which is a special page or collection of pages) stores the addresses of pages as well as metadata.
Sorted files have ordered data according to some attribute.
To understand the differences, let’s look at average-cases, ignore CPU costs, and ignore further optimizations. Say the number of pages is :
| Operations | Heap file | Sorted file | Remark |
|---|---|---|---|
| Scan all records | |||
| Equality search | assumes 0 or 1 match | ||
| Range search | |||
| Insert | potentially need to shift everything over | ||
| Delete | |||
| Update | |||
| So, what’s in a page? | |||
| A page header contains metadata about the contents of the page |
- Page size
- Checksum
- Transaction history
- Compression metadata
- Schema information
- Data sketches
In some systems, pages are self-contained
Data in a page can be stored in many ways,
- Tuple-oriented
- Column-oriented
- Partition attribute across
If the tuple size is fixed then finding the th attribute is very easy, as well as storing them. However deleting a tuple is difficult and requires either moving tuples or finding a way to track empty slots. A bitmap of bits can be used to keep track of info for a page that stores entries.
If the tuple size is variable then things get difficult. Fields must be delimited by special symbols, although this creates another overheard for finding attributes. Some systems store all the offsets in the header, which requires more storage, but is quite efficient (and works nicely for NULL/empty values). Now, the question is how to store variable-sized tuples. The trick is to have a “slotted” page:
- Pointers are stored at the start
- When a tuple is added, it’s put at the end of the file, at the next empty slot
- When a tuple is removed, the space (and where the pointer was stored) stays empty
- A garbage collecting routine can compact the page
Limitations of tuple-oriented storage:
- Fragmentation (until GC or similar compacts the tuples)
- Superfluous disk I/O
- Lots of random I/O since columns are not stored together
HDD
Random access involves mechanical movement however disk provides cheap, non-volatile storage
- Move arm to track (seek latency)
- Rotate the platters to the correct sector (rotational latency)
- Read/write pages with the head ()
SSD
Operates electronically, garbage collection, updates out of place