Spark — Runtime, Errors & Concurrency

8. Error handling and edge cases

Error handling patterns

Layer	Pattern	Examples
Core	Typed error objects	`SparkCoreErrors`, `SparkException`
SQL	Analysis-time and execution-time exception types	`AnalysisException`, `QueryExecutionErrors` (unresolved column, type mismatch)
Tasks	Serialize failure reason back to driver	`ExceptionFailure`, `FetchFailed`
Scheduler	Retry with limits	`maxTaskFailures`, stage abort
Logging	Structured logging (Log4j2)	`spark.log.structuredLogging.enabled`

Task failure flow

Executor: task throws
  → TaskRunner catches → statusUpdate(FAILED, reason)
  → TaskSchedulerImpl.statusUpdate
      → if FetchFailed → DAGScheduler.handleTaskCompletion (resubmit stage)
      → else if attempts < max → retry on another executor
      → else → abort stage → fail job
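
The branching in this flow can be sketched as a small decision function. This is a toy model, not Spark's code: `Action`, `TaskFailure`, and `on_task_failed` are hypothetical names, and the default of 4 is taken from spark.task.maxFailures.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RESUBMIT_STAGE = "resubmit map stage"
    RETRY_TASK = "retry task on another executor"
    ABORT_STAGE = "abort stage and fail job"


@dataclass
class TaskFailure:
    reason: str           # e.g. "FetchFailed" or "ExceptionFailure"
    failures_so_far: int  # failures of this task, including this one


def on_task_failed(failure: TaskFailure, max_failures: int = 4) -> Action:
    """Toy decision function mirroring the flow above (hypothetical helper)."""
    if failure.reason == "FetchFailed":
        # Missing shuffle input: the parent map stage has to be recomputed.
        return Action.RESUBMIT_STAGE
    if failure.failures_so_far < max_failures:
        return Action.RETRY_TASK
    return Action.ABORT_STAGE
```

With the default limit, the third ordinary failure still yields a retry and the fourth aborts the stage.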

Shuffle fetch failure

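When a map output file is missing (for example, because its executor was lost), the reducer task fails with FetchFailed. The failure propagates to DAGScheduler, which unregisters the lost map outputs and resubmits the map stage to recompute them. This path is distinct from generic task failure: a fetch failure does not count toward spark.task.maxFailures. Evidence: DAGScheduler.scala header comments (lines 106–114).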

SQL analysis vs execution errors

Analysis: Thrown during analyzer.executeAndCheck before any job — fail fast in driver.
Execution: Data issues (divide by zero, corrupt files) surface as task failures or SparkException at action time.
Lazy analysis: Some errors deferred until action if isLazyAnalysis is true.
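
The split between the two phases can be illustrated with a toy plan object. This is only a sketch in plain Python: `Divide`, `AnalysisError`, and `ExecutionError` are hypothetical stand-ins, not Spark classes. Column resolution fails at construction time, while a bad row fails only when data is processed.

```python
class AnalysisError(Exception):
    """Raised while resolving the plan, before any data is read."""


class ExecutionError(Exception):
    """Raised while rows are being processed."""


class Divide:
    def __init__(self, schema, numerator, denominator):
        # Analysis phase: resolve column names against the schema eagerly.
        for col in (numerator, denominator):
            if col not in schema:
                raise AnalysisError(f"cannot resolve column '{col}'")
        self.numerator, self.denominator = numerator, denominator

    def collect(self, rows):
        # Execution phase: data problems only show up when rows are touched.
        out = []
        for row in rows:
            try:
                out.append(row[self.numerator] / row[self.denominator])
            except ZeroDivisionError as e:
                raise ExecutionError(f"division by zero in row {row}") from e
        return out
```

Building `Divide({"a", "b"}, "a", "missing")` fails immediately, whereas `Divide({"a", "b"}, "a", "b")` builds fine and fails only if `collect` meets a row with `b == 0`.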

Security-sensitive paths

SecurityManager — RPC auth, servlet filters for UI
spark.authenticate, network crypto settings
Delegation tokens for YARN/HDFS (HadoopDelegationTokenManager)
User-provided code in UDFs — runs with full executor privileges (sandboxing not provided by default)
Spark Connect — session isolation via SparkConnectSessionManager

9. Concurrency and lifecycle

Concurrency model

Component	Model
`DAGScheduler`	Single-threaded event loop (`DAGSchedulerEventProcessLoop`)
`TaskSchedulerImpl`	Thread-safe; synchronized task set managers
`Executor`	Thread pool — one thread per task slot (`spark.executor.cores`)
`BlockManager`	Fine-grained locks per block; master RPC serialized
`SparkContext`	Not every operation is documented as thread-safe; SQL uses `withActive` session guard
Structured Streaming	Micro-batch driver thread + concurrent state store maintenance
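
The single-threaded event loop model in the first row can be sketched as follows. This is a generic illustration of the pattern, assuming a queue with one consumer thread; `EventLoop` is a hypothetical class, not `DAGSchedulerEventProcessLoop` itself.

```python
import queue
import threading


class EventLoop:
    """Toy single-consumer loop: handler state needs no locks."""

    _STOP = object()

    def __init__(self, handler):
        self._events = queue.Queue()
        self._handler = handler
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def post(self, event):
        # Safe to call from any thread; the queue does the synchronization.
        self._events.put(event)

    def _run(self):
        while True:
            event = self._events.get()
            if event is self._STOP:
                return
            self._handler(event)  # always runs on the loop thread

    def stop(self):
        # Events posted before stop() are handled before the loop exits.
        self._events.put(self._STOP)
        self._thread.join()
```

Because only the loop thread ever calls the handler, any number of threads can post events while the handler mutates its state without locking.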

Retries and timeouts

spark.task.maxFailures (default 4): total failures allowed per task before the job is aborted, i.e. up to 3 retries
spark.network.timeout — RPC and shuffle timeout
spark.speculation — duplicate slow tasks on other nodes
RpcTimeout on endpoint asks — RpcTimeoutException
Tests: SparkFunSuite wraps tests in 20-minute timeout (spark.test.timeout)
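
A timeout on a blocking ask can be sketched like this. It is a simplified stand-in, assuming a thread pool instead of an RPC layer; `ask_with_timeout` and `RpcTimeoutError` are hypothetical names. Naming the controlling config key in the error message makes the timeout easier to tune.

```python
import concurrent.futures as cf


class RpcTimeoutError(Exception):
    """Hypothetical stand-in for a timeout raised on an endpoint ask."""


def ask_with_timeout(call, timeout_s, conf_key):
    # Run the call on a worker thread and wait at most timeout_s for a reply.
    pool = cf.ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(call).result(timeout=timeout_s)
    except cf.TimeoutError as e:
        raise RpcTimeoutError(
            f"no reply within {timeout_s}s (controlled by {conf_key})"
        ) from e
    finally:
        # Do not block the caller on a call that may never return.
        pool.shutdown(wait=False)
```

Note that the sketch only stops waiting; the underlying call keeps running on its worker thread until it finishes.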

Cancellation

sc.cancelJob(jobId), sc.cancelAllJobs()
SQL: spark.sql.execution.interruptOnCancel
Task kill via TaskScheduler.cancelTasks → executor KillTask message
Streaming: query.stop() interrupts the query execution thread; a batch that was not committed is re-run from the checkpoint on restart
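
Task kill is cooperative at heart: the running task has to notice the kill request. A minimal sketch of that idea, assuming a flag checked between records; `Task` and `TaskKilled` here are hypothetical toy classes, not Spark's.

```python
import threading


class TaskKilled(Exception):
    """Raised inside the task once a kill request has been observed."""


class Task:
    def __init__(self):
        self._killed = threading.Event()

    def kill(self):
        # Called from another thread (in this sketch: whoever cancels the job).
        self._killed.set()

    def run(self, records):
        processed = 0
        for _ in records:
            # Check the flag between records so the task stops promptly.
            if self._killed.is_set():
                raise TaskKilled(f"killed after {processed} records")
            processed += 1
        return processed
```

A task stuck in a single long blocking call never reaches the check, which is the case that interrupt-based settings such as spark.sql.execution.interruptOnCancel are meant to address.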

Resource cleanup

SparkContext.stop() — stops DAGScheduler, TaskScheduler, RpcEnv, BlockManager
ContextCleaner: tracks RDDs, shuffles, broadcasts, and accumulators through weak references and cleans up their state once the driver-side objects are garbage collected
ShutdownHookManager — JVM shutdown hook registered in SparkContext
Stage/job data structures cleared on completion (DAGScheduler invariant)
SQLExecution clears execution ID thread locals after actions
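
The weak-reference cleanup idea can be sketched with Python's `weakref.finalize`. This is an analogy only: `Cleaner` and `Dataset` are hypothetical, and the JVM mechanism (reference queues polled by a cleaning thread) differs in detail.

```python
import weakref


class Dataset:
    """Hypothetical driver-side handle that owns some external resource."""

    def __init__(self, resource_id):
        self.resource_id = resource_id


class Cleaner:
    def __init__(self):
        self.cleaned = []  # ids whose resources have been released

    def register(self, obj, resource_id):
        # finalize keeps only a weak reference to obj, so registering
        # an object does not keep it alive.
        weakref.finalize(obj, self.cleaned.append, resource_id)
```

Once the last strong reference to a registered `Dataset` goes away, the callback records its id; real cleanup would release blocks or shuffle files at that point.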

Dynamic allocation

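ExecutorAllocationManager (core/.../scheduler/dynalloc/) requests and kills executors based on load when spark.dynamicAllocation.enabled=true. Scaling down safely requires the external shuffle service (or shuffle tracking via spark.dynamicAllocation.shuffleTracking.enabled), because otherwise removing an executor also removes the shuffle files it wrote.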
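
A load-based target can be approximated as below. This is a deliberately simplified sketch: `target_executors` is a hypothetical helper, and it ignores ramp-up policy, idle timeouts, and everything else the real manager tracks. The bounds correspond to spark.dynamicAllocation.minExecutors and spark.dynamicAllocation.maxExecutors.

```python
import math


def target_executors(pending_tasks, running_tasks, tasks_per_executor,
                     min_executors, max_executors):
    """Executors needed to run all outstanding tasks at once, clamped."""
    needed = math.ceil((pending_tasks + running_tasks) / tasks_per_executor)
    return max(min_executors, min(max_executors, needed))
```

For example, 10 pending plus 6 running tasks at 4 task slots per executor gives a target of 4 executors, subject to the configured bounds.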

Consistency assumptions

Not transactional across stages — output committers provide file-level exactly-once for writes
Streaming: At-least-once by default; exactly-once with transactional sinks + checkpoint
Cache: Best-effort replication (StorageLevel); not durable
Idempotency: Task retries must be deterministic or use commit protocols
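
Why a commit protocol makes retries safe can be shown with a write-then-rename sketch. It assumes a local filesystem and a single committer deciding which attempt wins; `run_attempt` and `commit` are hypothetical helpers, not Spark's output committer API.

```python
import os


def run_attempt(out_dir, partition, attempt, rows):
    """Write one task attempt's output to a private temporary file."""
    tmp = os.path.join(out_dir, f"_temporary-part-{partition}-attempt-{attempt}")
    with open(tmp, "w") as f:
        f.writelines(f"{r}\n" for r in rows)
    return tmp


def commit(out_dir, partition, tmp_path):
    """Publish an attempt; only the first commit per partition wins.

    The exists-then-replace check is not atomic, so this sketch assumes
    commits for a partition are serialized by one coordinator.
    """
    final = os.path.join(out_dir, f"part-{partition}")
    if os.path.exists(final):
        os.remove(tmp_path)  # another attempt already committed
        return False
    os.replace(tmp_path, final)
    return True
```

Two attempts of the same task can both run to completion, yet only one `part-N` file becomes visible, which is what makes retries and speculation harmless for the final output.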

11. Non-obvious insights

Hidden coupling

SparkEnv.get used throughout executors — hard to unit test without full env
SQL QueryExecution nested instances must share transaction-aware Analyzer
Python UDF performance depends on batch size and Arrow enablement — crosses Python/JVM boundary
Shuffle registration happens at ShuffleDependency construction, not at run time

Implicit conventions

Physical operators end with Exec suffix
Config keys defined as typed ConfigBuilder entries in internal/config
Tests named *Suite.scala extending SparkFunSuite
Module names in dev/sparktestsupport/modules.py drive CI test selection

Magic constants / globals

SparkContext.activeContext — one context per JVM
Default parallelism from spark.default.parallelism or executor cores × instances
UI port 4040 increments if busy
PYTHONHASHSEED=0 set in bin/spark-submit for deterministic Python hashing
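
The "increment if busy" behaviour for the UI port can be sketched as a bounded probe. This is an illustration, not Spark's startup code: `port_is_free` and `first_free_port` are hypothetical helpers, and the retry bound of 16 mirrors the default of spark.port.maxRetries.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False


def first_free_port(start=4040, max_retries=16, is_free=port_is_free):
    """Try start, start+1, ... and return the first port that is free."""
    for offset in range(max_retries + 1):
        if is_free(start + offset):
            return start + offset
    raise RuntimeError(f"no free port in {start}..{start + max_retries}")
```

Passing a custom `is_free` predicate keeps the probing logic testable without opening real sockets.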

Generated code

Catalyst expression codegen produces Java source strings, compiled at runtime via Janino
Connect protobufs generated at build time
Antlr SQL parser from grammar files
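
Runtime codegen follows a generate-source-then-compile pattern. The sketch below shows that pattern in Python, using `compile`/`exec` where Catalyst uses Java source and Janino; the tuple-based expression tree and the names `gen_source` and `compile_expr` are assumptions made for illustration.

```python
def gen_source(expr):
    """Turn a tiny expression tree into Python source text."""
    kind = expr[0]
    if kind == "col":
        return f"row[{expr[1]!r}]"
    if kind == "lit":
        return repr(expr[1])
    if kind in ("+", "*"):
        return f"({gen_source(expr[1])} {kind} {gen_source(expr[2])})"
    raise ValueError(f"unsupported node: {kind}")


def compile_expr(expr):
    """Generate a function's source, compile it, and return (fn, source)."""
    source = f"def evaluate(row):\n    return {gen_source(expr)}\n"
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["evaluate"], source
```

The tree is walked once at compile time; after that, each row is evaluated by straight-line generated code instead of by interpreting the tree again.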

Performance-sensitive code

WholeStageCodegenExec — hot query path
UnsafeRow / common/unsafe: compact binary row format over raw on-heap or off-heap memory
SortShuffleWriter — disk I/O patterns for shuffle
HashAggregateExec / TungstenAggregationIterator: hash aggregation in SQL
Broadcast join threshold (spark.sql.autoBroadcastJoinThreshold)

Backward compatibility

Config entries often have legacy fallbacks (LEGACY_* configs)
MiMa (binary compatibility) checks on public artifacts via SBT plugin
Spark Connect protocol versioned separately from core
R API marked deprecated in README

Architecture diagram: failure domains

┌──────────── Driver failure ───────────────┐
│ Lose entire app unless checkpoint/        │
│ streaming recovery from durable log       │
└───────────────────────────────────────────┘
┌──────────── Executor failure ─────────────┐
│ Tasks retried; shuffle blocks rebuilt     │
│ if not using external shuffle service     │
└───────────────────────────────────────────┘
┌──────────── Task failure ─────────────────┐
│ Retry on another executor (bounded)       │
└───────────────────────────────────────────┘
