Stage 6 — Pager & WAL
pager.c · wal.c — Page cache, atomic commits, and crash recovery
Role of the Pager

The Pager sits between the B-tree engine and the OS layer. Its jobs are:

  • Page cache: keeps recently-used pages in memory to avoid repeated disk I/O.
  • Atomic commits: ensures that a transaction either fully commits or fully rolls back, even if the process crashes mid-write.
  • Concurrency: manages file-level locking so multiple readers don't block each other, and readers are isolated from writers.
  • WAL mode: an alternative journaling mode that allows concurrent readers while a writer is active.
Abstraction boundary: The B-tree calls sqlite3PagerGet(pgno) to get a page and sqlite3PagerWrite(pPage) before modifying it. The B-tree never writes to disk directly and is unaware of journals or locking.
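The calling discipline described above can be sketched with a toy pager. This is a minimal illustration, not SQLite code: every `Toy*` type and `toy_*` function is a hypothetical stand-in, and the "journal" is reduced to a counter. It only shows the order of operations a client such as the B-tree is expected to follow: get the page, request write access, modify, release.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-ins; every Toy* / toy_* name is hypothetical, not SQLite API. */
#define TOY_PGSZ  16
#define TOY_NPAGE 4

typedef struct ToyPage {
  unsigned char data[TOY_PGSZ];
  int nRef;       /* references currently held by callers */
  int writable;   /* set once toy_pager_write() has run for this page */
} ToyPage;

typedef struct ToyPager {
  ToyPage aPage[TOY_NPAGE+1];  /* indexed by 1-based page number */
  int nJournalled;             /* how many originals were "journalled" */
} ToyPager;

/* Hand out a reference to a page; returns nonzero for a bad page number. */
static int toy_pager_get(ToyPager *p, int pgno, ToyPage **ppPage){
  if( pgno<1 || pgno>TOY_NPAGE ) return 1;
  p->aPage[pgno].nRef++;
  *ppPage = &p->aPage[pgno];
  return 0;
}

/* Request write access; the original is "saved" only on the first request. */
static int toy_pager_write(ToyPager *p, ToyPage *pPg){
  if( !pPg->writable ){
    p->nJournalled++;
    pPg->writable = 1;
  }
  return 0;
}

static void toy_pager_unref(ToyPage *pPg){ pPg->nRef--; }

/* The caller's side: get, ask permission to write, modify, release. */
static int toy_btree_set_byte(ToyPager *p, int pgno, int off, unsigned char v){
  ToyPage *pPg;
  int rc;
  if( off<0 || off>=TOY_PGSZ ) return 1;
  rc = toy_pager_get(p, pgno, &pPg);
  if( rc ) return rc;
  rc = toy_pager_write(p, pPg);   /* must come before touching data[] */
  if( rc==0 ) pPg->data[off] = v;
  toy_pager_unref(pPg);
  return rc;
}
```

Calling `toy_btree_set_byte()` twice on the same page bumps `nJournalled` only once, which mirrors the idea that a page's original content needs saving only the first time it is touched in a transaction.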
Key Functions
pager.c:5726 — sqlite3PagerGet: fetch a page into the cache
int sqlite3PagerGet(
  Pager *pPager,   /* the pager open on the database file */
  Pgno pgno,       /* page number to fetch (1-based) */
  DbPage **ppPage, /* OUT: pointer to page in cache */
  int flags        /* PAGER_GET_NOCONTENT etc. */
){
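  /* Dispatches through the pPager->xGet function pointer, which is set
     to getPageNormal (ordinary cached reads), getPageMMap (memory-mapped
     I/O), or getPageError (pager is in an error state). In WAL mode the
     normal path checks the WAL for the newest copy of the page first.
     Returns a pointer to a PgHdr held in the page cache. */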
  return pPager->xGet(pPager, pgno, ppPage, flags);
}
pager.c:6234 — sqlite3PagerWrite: mark page dirty before modifying
int sqlite3PagerWrite(PgHdr *pPg){
  /* This must be called BEFORE the B-tree modifies page content.
     In rollback journal mode:
       1. If not yet in the journal, write the ORIGINAL content to
          the journal file (so it can be restored on rollback).
       2. Mark the page as dirty in the cache.
     In WAL mode:
       The page is copied to the WAL file on commit, not here. */
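  Pager *pPager = pPg->pPager;
  if( (pPg->flags & PGHDR_WRITEABLE)!=0 && pPager->dbSize>=pPg->pgno ){
    /* Page was already made writable in this transaction. */
    if( pPager->nSavepoint ) return subjournalPageIfRequired(pPg);
    return SQLITE_OK;
  }
  ...
  /* Common case: pager_write() journals the original content if
     needed and marks the page dirty. */
  return pager_write(pPg);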
}
Rollback Journal Mode — Atomic Commit

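In the default (rollback journal) mode, SQLite makes commits atomic by saving the original content of every page it is about to change in a separate journal file. If the transaction fails partway through, the saved originals are used to undo it: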

Commit sequence (rollback journal):

1. RESERVED lock obtained on database file.
2. For each page to be modified:
   a. Original page content written to journal file (-journal).
   b. Page modified in cache.
3. EXCLUSIVE lock obtained.
4. sqlite3PagerCommitPhaseOne():
   a. Journal file fsync'd (durable write of originals).
   b. All dirty pages written to database file.
   c. Database file fsync'd.
5. sqlite3PagerCommitPhaseTwo():
   a. Journal file deleted, truncated to 0, or its header zeroed,
      depending on the journal mode. This is the commit point.
   b. EXCLUSIVE lock released.

Crash recovery (on next open):
  If a leftover ("hot") journal exists → read originals back → restore
  database to pre-transaction state → delete journal. The interrupted
  transaction is undone as if it had never happened.
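The commit and recovery sequences above can be illustrated with a small in-memory model. This is a sketch under loud assumptions: all `Toy*` and `toy_*` names are hypothetical, pages are 8 bytes, page indexes are 0-based, locks and fsync are omitted, and writes go straight to the pretend database file so that a "crash" leaves it half-updated.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: Toy* / toy_* names are hypothetical, not SQLite APIs. */
#define TOY_NPAGE 4      /* pages in the pretend database file */
#define TOY_PGSZ  8      /* tiny page size to keep the demo readable */

typedef struct ToyDb {
  char file[TOY_NPAGE][TOY_PGSZ];     /* "on-disk" database pages */
  char journal[TOY_NPAGE][TOY_PGSZ];  /* saved original page images */
  int  inJournal[TOY_NPAGE];          /* 1 if this page's original is saved */
  int  journalExists;                 /* models the -journal file on disk */
} ToyDb;

/* Save the original image the first time a page is touched, then modify. */
static void toy_write(ToyDb *db, int pgno, const char *content){
  if( !db->inJournal[pgno] ){
    memcpy(db->journal[pgno], db->file[pgno], TOY_PGSZ);
    db->inJournal[pgno] = 1;
    db->journalExists = 1;
  }
  snprintf(db->file[pgno], TOY_PGSZ, "%s", content);
}

/* Removing the journal is the moment the transaction becomes permanent. */
static void toy_delete_journal(ToyDb *db){
  memset(db->inJournal, 0, sizeof(db->inJournal));
  db->journalExists = 0;
}

static void toy_commit(ToyDb *db){ toy_delete_journal(db); }

/* On next open: a leftover journal means the last writer never committed.
   Returns the number of pages restored. */
static int toy_recover(ToyDb *db){
  int i, restored = 0;
  if( !db->journalExists ) return 0;
  for(i=0; i<TOY_NPAGE; i++){
    if( db->inJournal[i] ){
      memcpy(db->file[i], db->journal[i], TOY_PGSZ);
      restored++;
    }
  }
  toy_delete_journal(db);
  return restored;
}

/* Returns 1 if a crashed transaction is fully undone by recovery. */
static int toy_rollback_demo(void){
  ToyDb db;
  int n;
  memset(&db, 0, sizeof(db));
  toy_write(&db, 1, "old");
  toy_commit(&db);              /* committed baseline: page 1 holds "old" */

  toy_write(&db, 1, "new");     /* second transaction starts... */
  toy_write(&db, 2, "extra");
  /* ...and the process "crashes" here, before toy_commit(). */

  n = toy_recover(&db);         /* next open finds the journal */
  return n==2 && strcmp(db.file[1], "old")==0 && db.file[2][0]==0;
}
```

The design point the model tries to capture is that the journal holds undo information: as long as it exists, the database can be put back exactly as it was, and once it is gone the new content is the committed state.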
WAL Mode (Write-Ahead Log)

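When WAL mode is enabled (PRAGMA journal_mode=WAL), changes are appended to a separate -wal file instead of being written directly to the database file. Each reader sees a consistent snapshot as of the moment its read transaction began; the writer does not block readers, and readers do not block the writer.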

WAL commit sequence:

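1. New/modified pages are appended to the WAL file as frames.
2. The last frame of the transaction is marked as a commit frame
   (its header records the database size in pages after the commit).
3. WAL file is fsync'd (when PRAGMA synchronous is FULL or higher).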
4. Readers check the WAL for the latest version of each page;
   if not found in WAL, they read from the main database file.

WAL checkpoint (periodic):
  Pages in the WAL are written back to the main database file.
  By default this happens automatically once the WAL reaches 1000 pages.

Concurrency:
  ┌───────────┐     ┌───────────────┐     ┌──────────┐
  │  Readers  │────▶│  database.db  │     │  Writer  │
  │(unblocked)│     │ (old content) │     │          │
  └───────────┘     └───────────────┘     └────┬─────┘
       │                                        │
       │         ┌──────────────┐               │ appends
       └────────▶│  -wal file   │◀──────────────┘
      (check WAL │  new pages   │
       first)    └──────────────┘
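The read path in the diagram (check the WAL first, fall back to the database file) and the checkpoint step can be modelled in a few lines. This is a sketch only: the `ToyFrame`/`ToyWal` types and `toy_wal_*` functions are hypothetical, a page's content is reduced to a single int, the real wal-index and its locks are left out, and the checkpoint assumes no reader or writer is active.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: Toy* / toy_* names are hypothetical, not SQLite APIs. */
#define TOY_MAXFRAME 16

typedef struct ToyFrame {
  int pgno;      /* which database page this frame holds */
  int value;     /* stand-in for the page content */
  int isCommit;  /* 1 on the last frame of a committed transaction */
} ToyFrame;

typedef struct ToyWal {
  ToyFrame aFrame[TOY_MAXFRAME];
  int nFrame;    /* frames appended so far */
  int mxCommit;  /* number of frames covered by the latest commit */
} ToyWal;

/* Writer: append one frame; a commit frame publishes everything before it. */
static int toy_wal_append(ToyWal *w, int pgno, int value, int isCommit){
  if( w->nFrame>=TOY_MAXFRAME ) return -1;
  w->aFrame[w->nFrame].pgno = pgno;
  w->aFrame[w->nFrame].value = value;
  w->aFrame[w->nFrame].isCommit = isCommit;
  w->nFrame++;
  if( isCommit ) w->mxCommit = w->nFrame;
  return 0;
}

/* Reader: a snapshot is the value of mxCommit when the read began.
   Search newest-to-oldest within the snapshot, else use the database. */
static int toy_wal_read(const ToyWal *w, const int *db, int snapshot, int pgno){
  int i;
  for(i=snapshot-1; i>=0; i--){
    if( w->aFrame[i].pgno==pgno ) return w->aFrame[i].value;
  }
  return db[pgno];
}

/* Checkpoint: copy committed frames into the database, then reset the WAL. */
static void toy_wal_checkpoint(ToyWal *w, int *db){
  int i;
  for(i=0; i<w->mxCommit; i++) db[w->aFrame[i].pgno] = w->aFrame[i].value;
  w->nFrame = 0;
  w->mxCommit = 0;
}

/* Returns 1 if an older snapshot stays stable while a newer commit lands. */
static int toy_wal_demo(void){
  int db[4] = {0, 10, 20, 30};   /* db[pgno]; index 0 unused */
  ToyWal w;
  int snap, old, cur, p3;
  memset(&w, 0, sizeof(w));
  toy_wal_append(&w, 1, 11, 0);
  toy_wal_append(&w, 2, 21, 1);  /* first commit */
  snap = w.mxCommit;             /* a reader starts here */
  toy_wal_append(&w, 1, 12, 1);  /* a later commit by the writer */
  old = toy_wal_read(&w, db, snap, 1);        /* reader still sees 11 */
  cur = toy_wal_read(&w, db, w.mxCommit, 1);  /* a new reader sees 12 */
  p3  = toy_wal_read(&w, db, snap, 3);        /* not in WAL: from db */
  toy_wal_checkpoint(&w, db);
  return old==11 && cur==12 && p3==30 && db[1]==12 && db[2]==21;
}
```

Because the writer only ever appends, a reader that remembers where the WAL ended when it started can ignore everything after that point, which is what gives it a stable snapshot in this model.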
Page Cache (pcache.c)

The page cache is a separate module (pcache.c + pcache1.c). The default implementation in pcache1.c recycles unpinned pages in least-recently-used (LRU) order. The Pager delegates cache management to the pluggable sqlite3_pcache_methods2 interface, so an application can supply its own cache implementation through sqlite3_config(SQLITE_CONFIG_PCACHE2, ...).

/* Page header tracked by the pager (pcache.h) */
struct PgHdr {
  sqlite3_pcache_page *pPage; /* data managed by pcache */
  void *pData;                /* page content (one page; 4096 bytes by default) */
  void *pExtra;               /* extra per-page data for btree */
  PgHdr *pDirty;              /* linked list of dirty pages */
  Pager *pPager;              /* owning pager */
  Pgno pgno;                  /* page number */
  u16 flags;                  /* PGHDR_DIRTY, PGHDR_NEED_SYNC ... */
  ...
};
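To make the LRU idea concrete, here is a minimal sketch of least-recently-used eviction over a fixed number of slots. It is not how pcache1.c is written: the `ToyCache` type and `toy_cache_fetch` function are hypothetical, recency is tracked with a simple logical clock, and pinned or dirty pages (which a real cache cannot simply discard) are ignored.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: Toy* / toy_* names are hypothetical, not SQLite APIs. */
#define TOY_NSLOT 3

typedef struct ToyCache {
  int pgno[TOY_NSLOT];               /* cached page number, 0 = empty slot */
  unsigned long lastUse[TOY_NSLOT];  /* logical time of most recent access */
  unsigned long clock;               /* incremented on every access */
  int nMiss;                         /* how many fetches had to "hit disk" */
} ToyCache;

/* Return the slot holding pgno (1-based). On a miss, the slot with the
   oldest lastUse is reused; empty slots have lastUse 0 and go first. */
static int toy_cache_fetch(ToyCache *c, int pgno){
  int i, victim = 0;
  for(i=0; i<TOY_NSLOT; i++){
    if( c->pgno[i]==pgno ){          /* hit: just refresh recency */
      c->lastUse[i] = ++c->clock;
      return i;
    }
    if( c->lastUse[i] < c->lastUse[victim] ) victim = i;
  }
  c->nMiss++;                        /* miss: evict the LRU slot */
  c->pgno[victim] = pgno;
  c->lastUse[victim] = ++c->clock;
  return victim;
}
```

With three slots, fetching pages 1, 2, 3, then 1 again, then 4 evicts page 2, since page 1 was refreshed by the repeat access and page 2 is now the least recently used.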
Next Stage

When the Pager needs to actually read bytes from or write bytes to the file, it calls through the OS / VFS abstraction layer.