How Code Generation Works
The code generators translate AST structures into VDBE opcode instructions. They call sqlite3VdbeAddOp*() (defined in vdbeaux.c) to append opcodes to the Vdbe object's instruction array. The VDBE is a register machine: each instruction has up to three integer operands (P1, P2, P3), an optional pointer operand (P4), and a small flags operand (P5).
Code generation functions receive a Parse* context which holds a pointer to the Vdbe being built. Labels are resolved via backpatching — a jump target is recorded as a placeholder and filled in later when the destination address is known.
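The label mechanism described above can be sketched in a few lines of C. This is a toy model, not SQLite's code: ToyProg, toyMakeLabel, toyResolveLabel, and toyBackpatch are hypothetical names, and the sketch assumes that every negative P2 value is an unresolved label.

```c
#include <assert.h>

/* Toy types for illustration only; these are not SQLite's structures. */
enum { TOY_MAX_OPS = 64, TOY_MAX_LABELS = 16 };

typedef struct ToyOp { int opcode, p1, p2, p3; } ToyOp;

typedef struct ToyProg {
  ToyOp aOp[TOY_MAX_OPS];      /* instruction array */
  int nOp;                     /* number of instructions emitted */
  int aLabel[TOY_MAX_LABELS];  /* address bound to each label, -1 if unbound */
  int nLabel;                  /* number of labels handed out */
} ToyProg;

/* Append an instruction and return its address. */
static int toyAddOp(ToyProg *p, int opcode, int p1, int p2, int p3){
  assert( p->nOp<TOY_MAX_OPS );
  ToyOp *pOp = &p->aOp[p->nOp];
  pOp->opcode = opcode;
  pOp->p1 = p1;
  pOp->p2 = p2;
  pOp->p3 = p3;
  return p->nOp++;
}

/* Hand out a label: a negative number that cannot be mistaken for an address. */
static int toyMakeLabel(ToyProg *p){
  assert( p->nLabel<TOY_MAX_LABELS );
  p->aLabel[p->nLabel] = -1;
  return -(++p->nLabel);       /* labels are -1, -2, -3, ... */
}

/* Bind a label to the address of the next instruction to be emitted. */
static void toyResolveLabel(ToyProg *p, int label){
  assert( label<0 && -label<=p->nLabel );
  p->aLabel[-label-1] = p->nOp;
}

/* Final pass: replace every negative P2 with the address bound to its label. */
static void toyBackpatch(ToyProg *p){
  for(int i=0; i<p->nOp; i++){
    int p2 = p->aOp[i].p2;
    if( p2<0 ){
      assert( p->aLabel[-p2-1]>=0 );  /* every label in use must be bound */
      p->aOp[i].p2 = p->aLabel[-p2-1];
    }
  }
}
```

A caller would create a label, pass it as P2 of a forward jump, emit the code in between, bind the label once the destination is reached, and run the backpatch pass once at the end.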
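pParse->nMem holds the highest register number allocated so far, so ++pParse->nMem yields the next free register. Code generators allocate registers this way for temporary values and column results. Cursor numbers are allocated from a separate counter, pParse->nTab.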
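A minimal sketch of counter-based register allocation, assuming registers are numbered from 1 and are never reused. ToyParse, toyAllocReg, and toyAllocRange are hypothetical names; only the nMem counter comes from the text.

```c
#include <assert.h>

/* Hypothetical stand-in for the register counter held in the parse context. */
typedef struct ToyParse {
  int nMem;   /* highest register number allocated so far */
} ToyParse;

/* Allocate a single register by bumping the counter. */
static int toyAllocReg(ToyParse *p){
  return ++p->nMem;
}

/* Allocate n consecutive registers and return the first of them. */
static int toyAllocRange(ToyParse *p, int n){
  assert( n>0 );
  int first = p->nMem + 1;
  p->nMem += n;
  return first;
}
```

A contiguous range is useful when one opcode reads several adjacent registers, as OP_ResultRow and OP_MakeRecord do.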
Opcode Emission Functions (vdbeaux.c)
All opcode emission goes through a family of functions in vdbeaux.c. They append to p->aOp[] and return the index of the new instruction (used for backpatching jumps).
/* vdbeaux.c:263: opcode emission (simplified) */
int sqlite3VdbeAddOp2(Vdbe *p, int op, int p1, int p2){
  int i;
  Op *pOp;
  i = p->nOp;
  p->nOp++;
  /* grow p->aOp[] if needed */
  pOp = &p->aOp[i];
  pOp->opcode = (u8)op;
  pOp->p1 = p1;
  pOp->p2 = p2;
  pOp->p3 = 0;
  pOp->p4.p = 0;
  pOp->p4type = P4_NOTUSED;
  return i;  /* return address for backpatching */
}
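The "grow p->aOp[] if needed" step is elided in the excerpt above. One way to fill it in, as a self-contained toy, is shown below. ToyVdbe, toyGrow, toyAddOp3, and toyAddOp2 are hypothetical names, and the capacity-doubling policy is an assumption made for this sketch rather than a statement about SQLite's allocator.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ToyOp { unsigned char opcode; int p1, p2, p3; } ToyOp;

typedef struct ToyVdbe {
  ToyOp *aOp;     /* heap-allocated instruction array */
  int nOp;        /* instructions in use */
  int nOpAlloc;   /* slots allocated */
} ToyVdbe;

/* Double the capacity of aOp[]. Returns 0 on success, -1 if out of memory. */
static int toyGrow(ToyVdbe *p){
  int nNew = p->nOpAlloc ? p->nOpAlloc*2 : 8;
  ToyOp *aNew = (ToyOp*)realloc(p->aOp, (size_t)nNew * sizeof(ToyOp));
  if( aNew==0 ) return -1;   /* the old array is still valid on failure */
  p->aOp = aNew;
  p->nOpAlloc = nNew;
  return 0;
}

/* Append one instruction. Returns its address, or -1 if out of memory. */
static int toyAddOp3(ToyVdbe *p, int op, int p1, int p2, int p3){
  assert( op>=0 && op<=255 );
  if( p->nOp>=p->nOpAlloc && toyGrow(p)!=0 ) return -1;
  ToyOp *pOp = &p->aOp[p->nOp];
  pOp->opcode = (unsigned char)op;
  pOp->p1 = p1;
  pOp->p2 = p2;
  pOp->p3 = p3;
  return p->nOp++;
}

/* Two-operand convenience wrapper: P3 defaults to 0. */
static int toyAddOp2(ToyVdbe *p, int op, int p1, int p2){
  return toyAddOp3(p, op, p1, p2, 0);
}
```

Growing before the slot is written keeps the returned address valid even when the array moves, because callers hold an index rather than a pointer.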
SELECT Code Generation (select.c)
sqlite3Select() is the largest and most complex code-generation function. It handles simple SELECTs, subqueries, compound queries (UNION/INTERSECT/EXCEPT), aggregates, and window functions.
As an illustration, a query such as SELECT name FROM users WHERE id = 1, executed as a full-table scan, compiles to a program of roughly this shape (simplified):

Addr  Opcode     P1  P2  P3  Comment
----  ---------  --  --  --  ---------------------------------------------------
0     Init       0   9   0   Start; jump to the setup code at addr 9
1     OpenRead   0   2   0   Open cursor 0 on table "users" (root page 2)
2     Rewind     0   8   0   Move to first row; jump to addr 8 if table is empty
3     Column     0   0   1   r[1] = users.id (column 0)
4     Ne         2   7   1   If r[1] != r[2], jump to addr 7 (next row)
5     Column     0   1   3   r[3] = users.name (column 1)
6     ResultRow  3   1   0   Output r[3] as a one-column result row
7     Next       0   3   0   Advance cursor; jump back to addr 3 if a row remains
8     Halt       0   0   0   End of program; sqlite3_step() returns SQLITE_DONE
9     Integer    1   2   0   r[2] = 1 (setup: load the constant once)
10    Goto       0   1   0   Jump back to addr 1 to run the query
The actual generated bytecode uses an index scan or full-table scan depending on what indices exist, which is decided by the WHERE-clause optimizer in where.c.
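For the full-table-scan case, the loop that such a program implements can be written in plain C. The sketch below is only an analogy: ToyUser and toyScanUsers are hypothetical, and an in-memory array stands in for the B-tree cursor. The comments map each step to the opcodes named in the listing.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyUser { int id; const char *name; } ToyUser;

/* Scan every row, keep the rows whose id equals `wanted`, and store up to
** nOutMax matching names in out[]. Returns the number of names stored. */
static size_t toyScanUsers(const ToyUser *aRow, size_t nRow, int wanted,
                           const char **out, size_t nOutMax){
  size_t nOut = 0;
  assert( aRow!=0 || nRow==0 );
  assert( out!=0 || nOutMax==0 );
  for(size_t i=0; i<nRow; i++){          /* Rewind ... Next */
    if( aRow[i].id!=wanted ) continue;   /* Column + Ne: skip this row */
    if( nOut<nOutMax ){
      out[nOut++] = aRow[i].name;        /* Column + ResultRow */
    }
  }
  return nOut;
}
```

With an index on id, the loop would be replaced by a seek to the matching entries, which is the choice the optimizer makes.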
WHERE Clause Optimizer (where.c)
The WHERE clause is the most important optimization target. where.c (7,886 lines) analyzes the WHERE expression, picks the best index for each table in the FROM list, and generates the loop structure that drives row iteration.
/* Pattern used by every query with a WHERE clause */
WhereInfo *pWInfo = sqlite3WhereBegin(
  pParse,     /* parse context */
  pTabList,   /* FROM clause (SrcList) */
  pWhere,     /* WHERE expression */
  pOrderBy,   /* ORDER BY (may be NULL) */
  ...
);
/* Code generator emits column extraction + result here */
for(i=0; i<nCol; i++){
  sqlite3ExprCode(pParse, pEList->a[i].pExpr, regResult+i);
}
sqlite3VdbeAddOp2(v, OP_ResultRow, regResult, nCol);
sqlite3WhereEnd(pWInfo);  /* close loops, emit OP_Next/OP_Prev */
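sqlite3WhereBegin() emits OP_OpenRead (or OP_OpenWrite), the opcode that positions each cursor (OP_Rewind for a full scan, OP_SeekGE for an index range), and the loop head. sqlite3WhereEnd() emits the matching OP_Next or OP_Prev for each loop and resolves the jump targets of those loops.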
where.c computes an estimated cost for each possible scan strategy and picks the lowest-cost plan. It considers full-table scans, index range scans, covering indexes, and multi-table join orderings.
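The idea of costing each strategy and keeping the cheapest can be shown with a deliberately tiny model. This is not SQLite's estimator: the cost formulas, the weight of 2 per matching row, and the names ToyPlan, ToyCost, toyLog2, and toyChoosePlan are all assumptions made for illustration.

```c
#include <assert.h>

typedef enum { TOY_FULL_SCAN, TOY_INDEX_LOOKUP } ToyPlan;

typedef struct ToyCost { ToyPlan plan; unsigned long cost; } ToyCost;

/* Integer ceil(log2(n)) for n>=1, used as the cost of one B-tree descent. */
static unsigned long toyLog2(unsigned long n){
  unsigned long r = 0;
  assert( n>=1 );
  while( n>1 ){
    n = n - n/2;   /* ceil(n/2) without overflow */
    r++;
  }
  return r;
}

/* Compare a full scan (cost: one unit per row) with an index lookup
** (cost: one descent plus two units per matching row, for the index
** entry and the table row). Returns the cheaper plan. */
static ToyCost toyChoosePlan(unsigned long nRow, unsigned long nMatch,
                             int hasIndex){
  ToyCost best = { TOY_FULL_SCAN, nRow };
  if( hasIndex ){
    unsigned long idx = toyLog2(nRow ? nRow : 1) + 2*nMatch;
    if( idx<best.cost ){
      best.plan = TOY_INDEX_LOOKUP;
      best.cost = idx;
    }
  }
  return best;
}
```

Even this crude model shows why an index is not always chosen: when most rows match, visiting the index and then the table costs more than reading the table once.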
INSERT / UPDATE / DELETE (insert.c · update.c · delete.c)
An INSERT of the values (1, 'hello') compiles to roughly the following (simplified):

OpenWrite   0     rootPage  nCols    -- open write cursor on table
Integer     1     r[1]               -- r[1] = 1
String8     0     r[2]      "hello"  -- r[2] = 'hello'
NewRowid    0     r[3]               -- r[3] = new rowid (automatically assigned)
MakeRecord  r[1]  2         r[4]     -- r[4] = encoded row from r[1..2]
Insert      0     r[4]      r[3]     -- btree insert: key=r[3], data=r[4]
Close       0                        -- close cursor

A DELETE compiles to roughly the following (simplified):

OpenWrite   0     rootPage        nCols  -- open write cursor
SeekRowid   0     end             r[1]   -- seek to rowid=r[1]; jump to end if not found
Delete      0     OPFLAG_NCHANGE         -- delete current row under cursor
Next        0     top                    -- loop to the next row (needed only for a scan, not a rowid seek)
Expression Code Generation (expr.c)
expr.c (~6,000 lines) generates VDBE code for every expression type: arithmetic, comparisons, function calls, subqueries, CASE expressions, and CAST. It is called by all DML code generators.
/* expr.c:5897: entry point; calls sqlite3ExprCodeTarget internally */
void sqlite3ExprCode(Parse *pParse, Expr *pExpr, int target){
  /* target = destination register number */
  int inReg = sqlite3ExprCodeTarget(pParse, pExpr, target);
  if( inReg!=target ){
    /* move result to requested register if generated elsewhere */
    sqlite3VdbeAddOp2(pParse->pVdbe, OP_SCopy, inReg, target);
  }
}
/* sqlite3ExprCodeTarget dispatches on pExpr->op:
     TK_INTEGER  → OP_Integer
     TK_FLOAT    → OP_Real
     TK_STRING   → OP_String8
     TK_COLUMN   → OP_Column (reads from a B-tree cursor)
     TK_AND      → recurse left + right, OP_And
     TK_FUNCTION → OP_Function
     TK_SELECT   → sqlite3CodeSubselect() ...
*/
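The dispatch-and-recurse pattern can be demonstrated on a toy expression tree. Everything below is hypothetical (ToyExpr, ToyParse, the TOY_TK_* and TOY_OP_* constants); it models only four node kinds and assumes cursor 0 for every column read. The TOY_TK_REGISTER case shows why the wrapper needs its copy step: the value already lives in some other register.

```c
#include <assert.h>

enum ToyTk { TOY_TK_INTEGER, TOY_TK_COLUMN, TOY_TK_PLUS, TOY_TK_REGISTER };
enum ToyOpcode { TOY_OP_INTEGER, TOY_OP_COLUMN, TOY_OP_ADD, TOY_OP_SCOPY };

typedef struct ToyExpr {
  enum ToyTk op;
  int iValue;                      /* literal, column index, or register */
  struct ToyExpr *pLeft, *pRight;  /* operands of TOY_TK_PLUS */
} ToyExpr;

typedef struct ToyOp { enum ToyOpcode opcode; int p1, p2, p3; } ToyOp;

typedef struct ToyParse {
  ToyOp aOp[32];   /* emitted instructions */
  int nOp;         /* number of instructions emitted */
  int nMem;        /* highest register allocated so far */
} ToyParse;

static void toyEmit(ToyParse *p, enum ToyOpcode op, int p1, int p2, int p3){
  assert( p->nOp<32 );
  ToyOp *pOp = &p->aOp[p->nOp++];
  pOp->opcode = op;
  pOp->p1 = p1;
  pOp->p2 = p2;
  pOp->p3 = p3;
}

/* Emit code for pExpr and return the register that holds the result.
** The result is usually, but not always, placed in `target`. */
static int toyExprCodeTarget(ToyParse *p, const ToyExpr *pExpr, int target){
  switch( pExpr->op ){
    case TOY_TK_INTEGER:
      toyEmit(p, TOY_OP_INTEGER, pExpr->iValue, target, 0);
      return target;
    case TOY_TK_COLUMN:
      toyEmit(p, TOY_OP_COLUMN, 0, pExpr->iValue, target);
      return target;
    case TOY_TK_PLUS: {
      int regL = ++p->nMem;   /* scratch register for the left operand */
      int regR = ++p->nMem;   /* scratch register for the right operand */
      int r1 = toyExprCodeTarget(p, pExpr->pLeft, regL);
      int r2 = toyExprCodeTarget(p, pExpr->pRight, regR);
      toyEmit(p, TOY_OP_ADD, r2, r1, target);  /* r[target] = r[r1] + r[r2] */
      return target;
    }
    case TOY_TK_REGISTER:
      return pExpr->iValue;   /* already computed: nothing to emit */
  }
  return target;
}

/* Wrapper: guarantee that the result ends up in `target`. */
static void toyExprCode(ToyParse *p, const ToyExpr *pExpr, int target){
  int inReg = toyExprCodeTarget(p, pExpr, target);
  if( inReg!=target ){
    toyEmit(p, TOY_OP_SCOPY, inReg, target, 0);
  }
}
```

Returning the result register instead of always copying lets the recursive case use a subexpression where it already sits, so a copy is emitted only when the caller insists on a specific destination.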
Finalization (build.c)
After all statement code is generated, sqlite3FinishCoding() performs final bookkeeping: it emits the OP_Halt instruction, resolves any outstanding forward-jump labels, and hands the completed Vdbe* back to sqlite3Prepare().
Next Stage
The completed Vdbe* object (opcode array + register count) is returned as the prepared statement. sqlite3_step() then hands it to sqlite3VdbeExec() to execute.