Kotlin Coroutines Explained in Depth
(Suspension / Resumption / Cancellation / Dispatch / Exception Handling — A Complete Walkthrough)
1. Overview — What Is a Coroutine (in One Sentence)
Kotlin coroutines are a lightweight, user-space framework for cooperative multitasking. With compiler support, suspend functions are transformed into state machines that carry a Continuation object, so asynchronous logic can be written like synchronous code without blocking threads. At each suspension point, the execution state is saved and control returns to the caller; on resumption, the coroutine continues via Continuation.resumeWith(...).
2. Core Concepts (Quick Glossary)
- suspend function: a function that can “suspend” execution; the compiler rewrites it into a function that implicitly accepts a Continuation<T>.
- Continuation<T>: the “resumption handle” of a coroutine; it stores context and the next execution step, and exposes resumeWith(Result<T>).
- CoroutineContext: runtime context for a coroutine (includes Job, CoroutineDispatcher, CoroutineName, CoroutineExceptionHandler, etc.).
- Job / SupervisorJob: lifecycle handles for coroutines; support cancel(), join(), parent-child relations, and structured concurrency.
- CoroutineDispatcher: defines how suspension/resumption is dispatched to threads or thread pools (Dispatchers.Default, IO, Main, Unconfined, etc.).
- suspendCoroutine / suspendCancellableCoroutine: primitives for creating suspend points from callback-style APIs (the cancellable version integrates with Job.cancel()).
- Structured Concurrency: parent scopes manage child coroutines, ensuring no “orphan” tasks remain and that exceptions/cancellation propagate properly.
- CancellationException: used to signal coroutine cancellation; treated as normal control flow (not as an uncaught exception).
3. Compiler View: How suspend Works
On the JVM, suspend fun foo(): T is compiled into a method of the form foo(continuation: Continuation<T>): Any?, which returns either the result or the marker COROUTINE_SUSPENDED.
The compiler converts the function into a state machine (similar to async/await transpilers in JavaScript).
- At every suspension point, local variables and the next state are stored in the Continuation (or a generated subclass), and the function returns a special marker (COROUTINE_SUSPENDED).
- When resumeWith is later called, the state machine jumps back to the saved state and continues execution until the next suspension point or completion.
In short: suspend doesn’t block threads; it slices execution into segments that can be paused and resumed later by the runtime.
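The save-and-return behaviour described above can be observed with nothing but the standard library. The following is a minimal sketch, not a view of the generated state machine itself: suspensionTrace and its trace messages are hypothetical names introduced for illustration. It starts a suspend lambda, stashes the Continuation at the suspension point, and resumes it by hand.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical demo: records the order of events around one suspension point.
fun suspensionTrace(): List<String> {
    val trace = mutableListOf<String>()
    var saved: Continuation<Int>? = null

    val body: suspend () -> Int = {
        trace += "segment 1 runs"
        // Suspension point: stash the continuation instead of resuming it.
        val x = suspendCoroutine<Int> { cont -> saved = cont }
        trace += "segment 2 runs with $x"
        x * 2
    }

    var outcome: Result<Int>? = null
    // The completion continuation receives the coroutine's final result.
    body.startCoroutine(Continuation<Int>(EmptyCoroutineContext) { outcome = it })
    trace += "startCoroutine returned"

    // Resuming by hand re-enters the body right after the suspension point.
    saved!!.resume(21)
    trace += "result = ${outcome!!.getOrThrow()}"
    return trace
}

fun main() = println(suspensionTrace())
```

The trace shows that startCoroutine returns before segment 2 runs: the calling thread was released at the suspension point and only re-entered the body when resume was called.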
4. Two Common Ways to Suspend Execution
4.1 Using Built-in suspend APIs
Functions like delay() or withContext() already provide suspension. They decide internally whether to suspend: if so, they store the Continuation in a scheduler or timer queue and return COROUTINE_SUSPENDED.
4.2 Creating Custom Suspension Points
You can build your own suspendable APIs using callbacks:
suspend fun awaitCallback(): String = suspendCoroutine { cont ->
    someAsyncApi { result, error ->
        if (error != null) cont.resumeWithException(error)
        else cont.resume(result)
    }
}
- suspendCoroutine: not automatically linked to coroutine cancellation; callbacks may still fire after the coroutine is canceled.
- suspendCancellableCoroutine:
  - Provides cont.invokeOnCancellation { ... } for cleanup when the coroutine is canceled.
  - The recommended approach for cancellable suspend points (e.g., canceling network calls).
5. Resumption Mechanism
When Continuation.resumeWith(result) is invoked:
- The ContinuationInterceptor (often a Dispatcher) may intercept it and dispatch resumption to a thread or queue.
- The saved state is restored, and the coroutine executes until the next suspend point or completion.
- If result contains an exception, it’s re-thrown at the suspension point, where a surrounding try/catch can handle it; otherwise it propagates to the coroutine context.
Note: resumeWith must be called exactly once per suspension. Continuations obtained from suspendCoroutine can be resumed from any thread, and a second resume attempt throws IllegalStateException.
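Both points, the exception re-thrown at the suspension point and the exactly-once rule, can be checked with a small standard-library sketch. The function name resumeTwiceTrace and the trace messages are hypothetical; the behaviour shown is that of continuations produced by suspendCoroutine.

```kotlin
import kotlin.coroutines.*

// Hypothetical demo: resume with a failure, then try to resume a second time.
fun resumeTwiceTrace(): List<String> {
    val trace = mutableListOf<String>()
    var saved: Continuation<String>? = null

    val body: suspend () -> Unit = {
        try {
            suspendCoroutine<String> { cont -> saved = cont }
        } catch (e: IllegalStateException) {
            // The failure passed to resumeWithException surfaces here.
            trace.add("caught at suspension point: ${e.message}")
        }
    }
    body.startCoroutine(Continuation<Unit>(EmptyCoroutineContext) { })

    saved!!.resumeWithException(IllegalStateException("boom"))
    try {
        // The continuation has already been resumed once, so this call is rejected.
        saved!!.resume("too late")
    } catch (e: IllegalStateException) {
        trace.add("second resume rejected")
    }
    return trace
}

fun main() = println(resumeTwiceTrace())
```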
6. Dispatchers & Continuation Interceptors
CoroutineDispatcher implements ContinuationInterceptor, deciding where coroutine code runs upon resumption.
Common implementations:
- Dispatchers.Default: shared work-stealing pool (ForkJoinPool-like).
- Dispatchers.IO: scalable pool for blocking I/O operations.
- Dispatchers.Main: UI thread dispatcher (Android / Desktop).
- Dispatchers.Unconfined: not confined to any thread; runs in the caller thread initially and resumes in the thread that triggers resumption.
Custom dispatchers can extend CoroutineDispatcher and override dispatch(context, block). Implementing Delay adds timer support, but note that Delay is marked as an internal kotlinx.coroutines API.
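As a rough illustration of a custom dispatcher, the sketch below overrides dispatch to hand every block to a single named thread and counts how often it is called. CountingDispatcher, the thread name, and runOnCountingDispatcher are hypothetical names, and kotlinx-coroutines-core is assumed to be on the classpath.

```kotlin
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicInteger
import kotlin.coroutines.CoroutineContext

// Hypothetical dispatcher: counts dispatches and runs blocks on one named daemon thread.
class CountingDispatcher : CoroutineDispatcher() {
    val dispatchCount = AtomicInteger()
    private val executor = Executors.newSingleThreadExecutor { r ->
        Thread(r, "counting-worker").apply { isDaemon = true }
    }

    override fun dispatch(context: CoroutineContext, block: Runnable) {
        dispatchCount.incrementAndGet()
        executor.execute(block)
    }
}

// Returns the name of the thread that actually ran the withContext block.
fun runOnCountingDispatcher(dispatcher: CountingDispatcher): String = runBlocking {
    withContext(dispatcher) { Thread.currentThread().name }
}

fun main() {
    val dispatcher = CountingDispatcher()
    println(runOnCountingDispatcher(dispatcher))
    println(dispatcher.dispatchCount.get())
}
```

The reported thread name starts with counting-worker (the runtime may append a coroutine id in debug mode), which shows that the interceptor, not the caller, chose where the block ran.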
7. Cancellation — Principles and Practice
7.1 Cooperative Cancellation
Coroutine cancellation is cooperative: Job.cancel() marks the job as canceled but does not forcibly interrupt threads. The coroutine must check for cancellation at “cancellation points” and exit voluntarily.
Common cancellation points: suspend functions (yield(), delay(), withContext()), select, or manual checks (isActive, coroutineContext[Job]?.isCancelled).
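A minimal sketch of a manual check, assuming kotlinx-coroutines-core is available: the loop below never suspends, so the isActive test is the only place where cancellation can be noticed. The function name spinUntilCancelled is hypothetical.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.delay
import kotlinx.coroutines.isActive
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical demo: a CPU-bound loop that cooperates with cancellation.
fun spinUntilCancelled(): Boolean = runBlocking {
    val job = launch(Dispatchers.Default) {
        var iterations = 0L
        // No suspension points here: the isActive check is the only exit.
        while (isActive) {
            iterations++
        }
    }
    delay(50)
    // Returns only because the loop observes the cancellation flag.
    job.cancelAndJoin()
    job.isCancelled
}

fun main() = println(spinUntilCancelled())
```

If the condition were while (true), cancelAndJoin() would never return, because nothing in the loop would ever look at the job's state.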
7.2 CancellationException
- Cancellation is propagated as a CancellationException (internally via resumeWithException(CancellationException())).
- It is treated as normal flow control, not as an unhandled fatal error.
- When using try/catch, you can catch it, but handle it separately from real exceptions and rethrow it so that cancellation keeps propagating.
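One way to keep cancellation separate from real failures is to catch CancellationException first and rethrow it. The sketch below assumes kotlinx-coroutines-core; guardedWork and cancelGuardedWork are hypothetical names.

```kotlin
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Hypothetical worker: distinguishes cancellation from genuine errors.
suspend fun guardedWork(trace: MutableList<String>) {
    try {
        delay(10_000)
    } catch (e: CancellationException) {
        trace.add("cancelled")
        throw e // rethrow so the coroutine still completes as cancelled
    } catch (e: Exception) {
        trace.add("real failure: ${e.message}")
    }
}

fun cancelGuardedWork(): List<String> = runBlocking {
    val trace = mutableListOf<String>()
    val job = launch { guardedWork(trace) }
    yield()             // let the child start and suspend inside delay()
    job.cancelAndJoin()
    trace
}

fun main() = println(cancelGuardedWork())
```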
7.3 Why suspendCancellableCoroutine Is Better
- It supports cancellation hooks with invokeOnCancellation.
- It allows canceling underlying operations (e.g., canceling HTTP calls) when the coroutine’s Job is canceled.
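The hook can be exercised without a real network stack. In this sketch FakeCall stands in for a cancellable request and is purely hypothetical, as are awaitFake and cancelPropagates; kotlinx-coroutines-core is assumed.

```kotlin
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.suspendCancellableCoroutine
import kotlinx.coroutines.yield

// Hypothetical stand-in for a request handle such as an HTTP call.
class FakeCall {
    @Volatile var cancelled = false
    fun cancel() { cancelled = true }
}

// Never completes on its own, simulating a slow request.
suspend fun awaitFake(call: FakeCall): String = suspendCancellableCoroutine { cont ->
    // Runs when the coroutine's Job is cancelled while suspended here.
    cont.invokeOnCancellation { call.cancel() }
}

fun cancelPropagates(): Boolean = runBlocking {
    val call = FakeCall()
    val job = launch { awaitFake(call) }
    yield()             // let the child reach the suspension point
    job.cancelAndJoin()
    call.cancelled
}

fun main() = println(cancelPropagates())
```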
7.4 Cleanup During Cancellation
If you need to perform suspending cleanup during cancellation, wrap it in withContext(NonCancellable):
try {
    // work
} finally {
    withContext(NonCancellable) {
        // safe suspend cleanup
    }
}
8. Structured Concurrency & Job Hierarchy
coroutineScope {} and launch {} enforce structured concurrency:
- Parent scopes wait for all child coroutines to finish.
- Parent cancellation propagates to children.
- SupervisorJob / supervisorScope: child failures don’t affect siblings.
Example:
coroutineScope {
    launch { fail() } // Cancels all siblings on failure
    launch { doWork() }
}
supervisorScope {
    launch { fail() } // Other children keep running
    launch { doWork() }
}
9. Exception Propagation & CoroutineExceptionHandler
Propagation rules:
- For root coroutines (GlobalScope.launch, etc.), uncaught exceptions go to CoroutineExceptionHandler.
- For structured scopes, exceptions bubble up to the parent coroutine, canceling siblings if not handled.
Key differences:
- launch: exceptions propagate immediately and cancel the parent.
- async: the exception is stored in the Deferred and rethrown by await(); as a child of a regular (non-supervisor) scope, a failed async still cancels its parent.
CoroutineExceptionHandler handles only uncaught top-level exceptions (not those in async that you later await).
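The difference between launch and async can be seen side by side inside a supervisorScope, where direct children report failures the way root coroutines do. This is a sketch under the assumption that kotlinx-coroutines-core is available; exceptionRoutes and the messages are hypothetical.

```kotlin
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.async
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.supervisorScope

// Hypothetical demo: where does each failure end up?
fun exceptionRoutes(): List<String> = runBlocking {
    val trace = mutableListOf<String>()
    val handler = CoroutineExceptionHandler { _, e -> trace.add("handler: ${e.message}") }

    supervisorScope {
        // launch: the uncaught exception goes to the handler in the child's context.
        val failing = launch(handler) { throw IllegalStateException("launch failed") }
        failing.join()

        // async: the exception is kept in the Deferred and rethrown by await().
        val deferred = async<Int> { throw IllegalArgumentException("async failed") }
        try {
            deferred.await()
        } catch (e: IllegalArgumentException) {
            trace.add("await: ${e.message}")
        }
    }
    trace
}

fun main() = println(exceptionRoutes())
```

With coroutineScope in place of supervisorScope, the first failure would instead cancel the whole scope and be rethrown to the caller.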
10. Behavior of Key APIs
- withContext(dispatcher) { ... }: a suspend function that switches context, runs the block, suspends/resumes as needed, and returns the result.
- launch { ... }: starts immediately, returns a Job, and schedules execution via a dispatcher.
- async { ... }: returns Deferred<T>; exceptions surface when await() is called.
- runBlocking { ... }: blocks the current thread until the coroutine completes (used in main functions/tests).
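A compact sketch showing the four builders together and what each one hands back, assuming kotlinx-coroutines-core; the function name builderResults and the values are arbitrary.

```kotlin
import kotlinx.coroutines.Deferred
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.async
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical demo: each builder and its return type.
fun builderResults(): Int = runBlocking {               // blocks the calling thread
    val job: Job = launch { delay(10) }                  // fire-and-forget, tracked via Job
    val deferred: Deferred<Int> = async { 20 }           // result retrieved with await()
    val fromContext: Int = withContext(Dispatchers.Default) { 22 } // suspends, returns the block's value
    job.join()
    deferred.await() + fromContext
}

fun main() = println(builderResults())
```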
11. Implementing a Cancellable Suspension
Basic (non-cancellable)
suspend fun awaitCallback(): String = suspendCoroutine { cont ->
    val callback = object : Callback {
        override fun onResult(result: String) = cont.resume(result)
        override fun onError(t: Throwable) = cont.resumeWithException(t)
    }
    register(callback)
}
Recommended (cancellable)
suspend fun awaitCancellable(): String = suspendCancellableCoroutine { cont ->
    val call = createCancelableCall()
    call.enqueue { result -> cont.resume(result) }
    cont.invokeOnCancellation {
        call.cancel() // Cancel underlying request
    }
}
12. Coroutine State Machine & Result Flow
Continuation.resumeWith(result) receives a Result<T> (success or failure).
The flow:
- continuation.intercepted() applies the interceptor (Dispatcher).
- The Continuation is dispatched (or runs immediately).
- The state machine resumes and jumps to the saved state.
COROUTINE_SUSPENDED is an internal marker meaning “not finished yet.”
13. Why Coroutines Are Lightweight
- A suspended coroutine is just a heap object holding its saved local variables and resume state, which is far cheaper than an OS thread with its own stack.
- Switching coroutines doesn’t involve kernel-level context switching; it’s simply queueing/resuming code on an existing thread.
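A quick way to get a feel for the cost is to launch a large number of coroutines that all suspend at once. The sketch below assumes kotlinx-coroutines-core; runManyCoroutines is a hypothetical name and the count is arbitrary.

```kotlin
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical demo: n coroutines suspended at the same time on a single thread.
fun runManyCoroutines(n: Int): Int = runBlocking {
    val finished = AtomicInteger()
    coroutineScope {
        repeat(n) {
            launch {
                delay(100)                 // all n coroutines are suspended here together
                finished.incrementAndGet()
            }
        }
    }                                      // coroutineScope waits for every child
    finished.get()
}

fun main() = println(runManyCoroutines(100_000))
```

Starting the same number of OS threads that each sleep for 100 ms would typically exhaust memory or hit thread limits; here every coroutine is only a small object waiting in a timer queue.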
14. Best Practices & Common Pitfalls
- Cancellation is cooperative: check isActive or call yield() in long-running tasks.
- Don’t run blocking I/O on Dispatchers.Default; use IO or custom pools.
- Use withContext(NonCancellable) for suspend cleanup in finally.
- Exception handling: use try/catch inside launch blocks; async exceptions appear only when calling await().
- Always use suspendCancellableCoroutine for cancellable suspensions and cleanup.
- Avoid GlobalScope unless global lifetime is intended.
- For testing, use runTest and TestDispatcher for deterministic control.
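The NonCancellable advice is the one that most often surprises people, so here is a small sketch of it in action, assuming kotlinx-coroutines-core. cleanupAfterCancel is a hypothetical name, and the delay inside finally stands in for any suspending cleanup call.

```kotlin
import kotlinx.coroutines.NonCancellable
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import kotlinx.coroutines.yield

// Hypothetical demo: suspending cleanup that survives cancellation.
fun cleanupAfterCancel(): List<String> = runBlocking {
    val trace = mutableListOf<String>()
    val job = launch {
        try {
            delay(10_000)
        } finally {
            // Without NonCancellable, this delay would throw immediately,
            // because the coroutine is already cancelled.
            withContext(NonCancellable) {
                delay(10)
                trace.add("cleanup finished")
            }
        }
    }
    yield()             // let the child start and suspend
    job.cancelAndJoin() // waits until the finally block has completed
    trace
}

fun main() = println(cleanupAfterCancel())
```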
15. Advanced Topics (Brief Overview)
- Select: waits for the first of multiple suspending operations to complete (race handling).
- Channels / Actors: coroutine-based communication primitives (similar to Go channels).
- Flow: cold asynchronous streams with backpressure and operators (analogous to Rx Observables).
- Debug Probes: runtime tracing and leak detection (DebugProbes.install, etc.).
- Cancellation Propagation Optimizations: explore CoroutineStart.LAZY, SupervisorJob, and custom scopes.
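To make "cold" concrete for Flow, the following sketch records when the producer body actually runs. It assumes kotlinx-coroutines-core; collectSquares and the trace messages are hypothetical.

```kotlin
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.map
import kotlinx.coroutines.runBlocking

// Hypothetical demo: the producer block runs only once collection starts.
fun collectSquares(): List<String> = runBlocking {
    val trace = mutableListOf<String>()
    val squares = flow<Int> {
        trace.add("producer started")
        for (i in 1..3) emit(i * i)
    }
    trace.add("flow created")   // nothing has been emitted yet
    squares.map { it + 1 }.collect { trace.add("got $it") }
    trace
}

fun main() = println(collectSquares())
```

"flow created" is recorded before "producer started", which is what distinguishes a cold Flow from a hot source such as a Channel.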
16. Full Example — Network Request with Timeout and Cleanup
// httpClient and request are assumed to be defined elsewhere (OkHttp-style API).
suspend fun fetchDataWithTimeout(): String = coroutineScope {
    launch {
        // optional monitoring
    }
    try {
        withTimeout(5_000) {
            suspendCancellableCoroutine<String> { cont ->
                val call = httpClient.newCall(request)
                call.enqueue(object : Callback {
                    override fun onResponse(call: Call, response: Response) {
                        // Close the response and forward a body-read failure
                        // instead of letting it escape the callback.
                        cont.resumeWith(runCatching { response.use { it.body!!.string() } })
                    }
                    override fun onFailure(call: Call, e: IOException) {
                        cont.resumeWithException(e)
                    }
                })
                cont.invokeOnCancellation {
                    call.cancel() // cancel underlying HTTP call
                }
            }
        }
    } finally {
        withContext(NonCancellable) {
            // cleanup or reporting
        }
    }
}
17. Summary: Key Takeaways
- The Kotlin compiler rewrites suspend functions into Continuation-based state machines, enabling synchronous-style asynchronous code without blocking.
- Suspension saves state and returns; resumption continues via Continuation.resumeWith, dispatched by the Dispatcher.
- Cancellation is cooperative, requiring explicit checks or cancellable primitives (suspendCancellableCoroutine).
- Structured concurrency (coroutineScope, SupervisorJob) ensures controlled lifecycles and predictable exception semantics.
- Exception handling differs from Rx-style flows: understand try/catch, CancellationException, and CoroutineExceptionHandler.