Stochastic Calculus — Study Notes

Study notes for Lawler's Stochastic Calculus (Chapter 1 — Martingales in Discrete Time).

Sections 1.5–1.7 – Square Integrable Martingales, Random Walk Integrals, and a Maximal Inequality

Notation at a Glance

| Symbol | Meaning |
|---|---|
| \(E[M_n^2] < \infty\) | Square integrability, the hypothesis of §1.5 |
| \(L^2(\Omega, \mathcal{F}, P)\) | Hilbert space of square-integrable random variables with inner product \((X,Y) = E[XY]\) |
| \((X, Y) = E[XY]\) | Inner product on \(L^2\); for mean-zero variables, orthogonality means \((X,Y) = 0\) |
| \(\Delta M_n = M_n - M_{n-1}\) | Increment of \(M_n\) at step \(n\) |
| \(E[\Delta M_{n+1} \cdot \Delta M_{m+1}] = 0,\; m \neq n\) | Orthogonality of martingale increments |
| \(J_n\) | Predictable integrand: \(J_n\) is \(\mathcal{F}_{n-1}\)-measurable |
| \(Z_n = \sum_{j=1}^n J_j X_j\) | Discrete stochastic integral of \(J\) with respect to the random walk \(S_n\) |
| \(\sigma^2 = E[X_j^2]\) | Variance of each i.i.d. mean-zero increment |
| \(\mathrm{Var}[Z_n] = \sigma^2 \sum_{j=1}^n E[J_j^2]\) | Variance rule for the discrete stochastic integral |
| \(\bar{Y}_n = \max\{Y_0, Y_1, \ldots, Y_n\}\) | Running maximum of a nonnegative submartingale \(Y_n\) |
| \(\overline{M}_n = \max\{\lvert M_0 \rvert, \ldots, \lvert M_n \rvert\}\) | Running maximum of \(\lvert M_n \rvert\) |
| \(P\{\bar{Y}_n \geq a\} \leq a^{-1} E[Y_n]\) | Doob’s maximal inequality for nonnegative submartingales |
| \(P\{\overline{M}_n \geq a\} \leq a^{-2} E[M_n^2]\) | Doob’s \(L^2\) maximal inequality for square integrable martingales |

Part 1 — Square Integrable Martingales

Definition — Square Integrable Martingale Definition

A martingale \(M_n\) is called square integrable if for each \(n\), \(E[M_n^2] < \infty\).

This is the condition that \(M_n \in L^2(\Omega, \mathcal{F}_n, P)\) at every time \(n\).

Why stronger than integrability: Square integrability \(E[M_n^2] < \infty\) implies integrability \(E[\lvert M_n \rvert] < \infty\) by Jensen’s inequality, but not vice versa.

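Why weaker than uniform \(L^2\) boundedness: The definition requires \(E[M_n^2] < \infty\) for each fixed \(n\), but the bound may grow with \(n\). For example, a random walk \(S_n\) whose steps are i.i.d. with mean zero and variance \(\sigma^2 > 0\) has \(E[S_n^2] = n\sigma^2\), which is finite for every \(n\) but unbounded in \(n\). The stronger condition \(\sup_n E[M_n^2] \leq C < \infty\) (used in OST III and the MCT) is a separate, stricter requirement.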

Orthogonality of increments

Random variables \(X, Y\) are orthogonal if \(E[XY] = E[X]\, E[Y]\).

For zero-mean random variables, orthogonality reduces to \(E[XY] = 0\). Independent random variables are orthogonal, but orthogonal random variables need not be independent.

Proposition 1.5.1 — Orthogonality of Martingale Increments Proposition

Suppose that \(M_n\) is a square integrable martingale with respect to \(\{\mathcal{F}_n\}\). Then if \(m < n\),

\[E[(\Delta M_{n+1})(\Delta M_{m+1})] = 0,\]

where \(\Delta M_k = M_k - M_{k-1}\). Moreover, for all \(n\),

\[E[M_n^2] = E[M_0^2] + \sum_{j=1}^n E\bigl[(\Delta M_j)^2\bigr].\]

Proof of orthogonality:

For \(m < n\), the increment \(\Delta M_{m+1} = M_{m+1} - M_m\) is \(\mathcal{F}_n\)-measurable (since \(m+1 \leq n\)). Therefore:

\[E[(\Delta M_{n+1})(\Delta M_{m+1}) \mid \mathcal{F}_n] = (\Delta M_{m+1})\, E[\Delta M_{n+1} \mid \mathcal{F}_n] = (\Delta M_{m+1}) \cdot 0 = 0.\]

The second equality uses the martingale property: \(E[\Delta M_{n+1} \mid \mathcal{F}_n] = E[M_{n+1} - M_n \mid \mathcal{F}_n] = 0\). Taking full expectations gives \(E[(\Delta M_{n+1})(\Delta M_{m+1})] = 0\).

Proof of the Pythagorean identity:

Write \(M_n = M_0 + \sum_{j=1}^n \Delta M_j\) and expand the square:

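\[M_n^2 = M_0^2 + 2 M_0 \sum_{j=1}^n \Delta M_j + \sum_{j=1}^n (\Delta M_j)^2 + \sum_{j \neq k} (\Delta M_j)(\Delta M_k).\]

Take expectations. The terms \(E[M_0\, \Delta M_j]\) vanish because \(M_0\) is \(\mathcal{F}_{j-1}\)-measurable and \(E[\Delta M_j \mid \mathcal{F}_{j-1}] = 0\); the cross terms \(E[(\Delta M_j)(\Delta M_k)]\), \(j \neq k\), vanish by orthogonality. What remains is: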

\[E[M_n^2] = E[M_0^2] + \sum_{j=1}^n E[(\Delta M_j)^2].\]
Interpretation: This is the Pythagorean theorem in $L^2$. The second moment of $M_n$ equals the second moment of $M_0$ plus the sum of the second moments of its increments: because the increments are mutually orthogonal, and orthogonal to $M_0$, there are no cross-term contributions. Since each increment has mean zero, $E[(\Delta M_j)^2]$ is also its variance. This is the exact analogue of $\lvert a_1 e_1 + \cdots + a_n e_n \rvert^2 = a_1^2 + \cdots + a_n^2$ for orthonormal vectors $e_1, \ldots, e_n$.
⚠️Common Misconception
Wrong: "Orthogonal martingale increments are independent."
Correct: Orthogonality ($E[\Delta M_{n+1} \cdot \Delta M_{m+1}] = 0$ for $m \neq n$) is weaker than independence. It says the increments are uncorrelated, not that they have no dependence structure. Independence implies orthogonality (for zero-mean variables), but the converse fails. The proof only uses the martingale property — no independence assumption is needed.
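
A small exact check can make the distinction concrete. The sketch below is an illustration added to these notes, not an example from the text: it assumes fair coin-tossing steps and uses the martingale \(M_k = S_k^2 - k\), whose increments are \(\Delta M_k = 2 S_{k-1} X_k\). It enumerates all \(2^n\) paths with exact rational arithmetic, so the helper names (`increments`, `expect`) are hypothetical and the results are exact rather than simulated.

```python
from fractions import Fraction
from itertools import product

def increments(path):
    """Increments dM_1..dM_n of M_k = S_k^2 - k along one +-1 path."""
    s, m_prev, out = 0, 0, []
    for k, x in enumerate(path, start=1):
        s += x
        m = s * s - k
        out.append(m - m_prev)
        m_prev = m
    return out

n = 4
paths = list(product((-1, 1), repeat=n))     # all 2^n equally likely paths
p = Fraction(1, len(paths))

def expect(f):
    """Exact expectation of f(increments) under the fair-coin measure."""
    return sum(p * f(increments(w)) for w in paths)

# Cross terms E[dM_j dM_k] for j < k: each one should be exactly zero.
cross = {(j, k): expect(lambda d: d[j] * d[k])
         for j in range(n) for k in range(j + 1, n)}

# Pythagorean identity with M_0 = 0: E[M_n^2] = sum_j E[(dM_j)^2].
second_moment = expect(lambda d: sum(d) ** 2)
sum_of_squares = sum(expect(lambda d, j=j: d[j] ** 2) for j in range(n))

# Dependence: E[dM_2 * dM_3^2] is nonzero although E[dM_2] * E[dM_3^2] = 0.
dependence = expect(lambda d: d[1] * d[2] ** 2)

print(all(v == 0 for v in cross.values()), second_moment, sum_of_squares, dependence)
```

For \(n = 4\) every cross term is zero and both sides of the Pythagorean identity equal 24, while \(E[\Delta M_2 (\Delta M_3)^2] = 16 \neq 0\): the increments are orthogonal but not independent.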

Part 2 — Integrals with Respect to Random Walk

This section defines the discrete stochastic integral and establishes its three fundamental properties. The setting is a predictable integrand $J_n$ and a random walk $S_n$ with i.i.d. mean-zero increments. The integral $Z_n = \sum_{j=1}^n J_j X_j$ is the discrete prototype of the Itô integral $\int_0^t A_s\, dB_s$.

Setup

Suppose that \(X_1, X_2, \ldots\) are independent, identically distributed random variables with mean zero and variance \(\sigma^2\).

The two main examples are:

  • Coin-tossing: \(P\{X_j = 1\} = P\{X_j = -1\} = \tfrac{1}{2}\), giving \(\sigma^2 = 1\).
  • Normal increments: \(X_j \sim N(0, \sigma^2)\).

Let \(S_n = X_1 + \cdots + X_n\) and let \(\{\mathcal{F}_n\}\) be the filtration generated by \(X_1, \ldots, X_n\).

A sequence of random variables \(J_1, J_2, \ldots\) is called predictable (with respect to \(\{\mathcal{F}_n\}\)) if for each \(n\), \(J_n\) is \(\mathcal{F}_{n-1}\)-measurable.

This is the non-anticipating condition from §1.2: The integrand \(J_n\) is determined by observations strictly before time \(n\).

The discrete stochastic integral is defined by:

\[Z_n = \sum_{j=1}^n J_j X_j = \sum_{j=1}^n J_j \,\Delta S_j.\]

Three fundamental properties

Three Properties of the Discrete Stochastic Integral Proposition

Property 1 — Martingale property

\[Z_n \text{ is a martingale with respect to } \{\mathcal{F}_n\}.\]

Proof:

\[E[Z_{n+1} \mid \mathcal{F}_n] = E[Z_n + J_{n+1} X_{n+1} \mid \mathcal{F}_n] = Z_n + J_{n+1}\, E[X_{n+1} \mid \mathcal{F}_n] = Z_n + J_{n+1} \cdot 0 = Z_n.\]

Here: \(Z_n\) is \(\mathcal{F}_n\)-measurable (Property 1 of §1.1); \(J_{n+1}\) is \(\mathcal{F}_n\)-measurable and pulls out (Property 5); \(X_{n+1}\) is independent of \(\mathcal{F}_n\) with \(E[X_{n+1}] = 0\) (Property 3).
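
The role of predictability in this proof can be checked by brute force. The following is a minimal sketch under stated assumptions: fair \(\pm 1\) steps, and a hypothetical betting rule `integrand` (stake 2 when the walk is below zero, otherwise 1) chosen only for illustration. It verifies \(E[Z_{n+1} \mid \mathcal{F}_n] = Z_n\) on every history, and contrasts this with an anticipating choice \(J_j = X_j\), which is not predictable.

```python
from fractions import Fraction
from itertools import product

def integrand(history):
    """Hypothetical predictable rule: J_j depends only on X_1..X_{j-1}."""
    return 2 if sum(history) < 0 else 1

def integral(path):
    """Z_n = sum_j J_j X_j along one path (index j is 0-based here)."""
    return sum(integrand(path[:j]) * x for j, x in enumerate(path))

def conditional_mean_next(history):
    """E[Z_{n+1} | X_1..X_n = history] for fair +-1 steps."""
    return Fraction(sum(integral(history + (x,)) for x in (-1, 1)), 2)

n = 5
# Check the martingale identity on every history of length 0..n-1.
martingale_ok = all(
    conditional_mean_next(h) == integral(h)
    for k in range(n)
    for h in product((-1, 1), repeat=k)
)

def peeking_integral(path):
    """Same sum with the anticipating choice J_j = X_j (not predictable)."""
    return sum(x * x for x in path)

# With peeking the sum is deterministic and grows: its mean is n, not 0.
peeking_mean = Fraction(sum(peeking_integral(w) for w in product((-1, 1), repeat=n)), 2 ** n)

print(martingale_ok, peeking_mean)
```

The predictable rule passes the check on all histories, whereas the peeking integrand has mean \(n\) rather than \(0\), so it cannot be a martingale started at zero.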


Property 2 — Linearity

If \(J_n, K_n\) are predictable sequences and \(a, b\) constants, then \(aJ_n + bK_n\) is predictable and:

\[\sum_{j=1}^n (aJ_j + bK_j) X_j = a \sum_{j=1}^n J_j X_j + b \sum_{j=1}^n K_j X_j.\]

Property 3 — Variance rule

\[\mathrm{Var}[Z_n] = E[Z_n^2] = \sigma^2 \sum_{j=1}^n E[J_j^2].\]

Proof:

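Assume \(E[J_j^2] < \infty\) for each \(j\), so that \(Z_n\) is a square integrable martingale with increments \(\Delta Z_j = J_j X_j\) and \(Z_0 = 0\). Since \(E[Z_n] = 0\), \(\mathrm{Var}[Z_n] = E[Z_n^2]\). By orthogonality of martingale increments (Proposition 1.5.1), the cross terms \(E[J_j X_j \cdot J_k X_k]\) vanish for \(j \neq k\):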

\[E[Z_n^2] = \sum_{j=1}^n E[J_j^2 X_j^2].\]

Since \(J_j\) is \(\mathcal{F}_{j-1}\)-measurable and \(X_j\) is independent of \(\mathcal{F}_{j-1}\):

\[E[J_j^2 X_j^2] = E\bigl[E[J_j^2 X_j^2 \mid \mathcal{F}_{j-1}]\bigr] = E\bigl[J_j^2\, E[X_j^2 \mid \mathcal{F}_{j-1}]\bigr] = E[J_j^2]\, \sigma^2.\]

Summing over \(j\) gives \(E[Z_n^2] = \sigma^2 \sum_{j=1}^n E[J_j^2]\).

Why the variance rule matters: It gives an explicit formula for the second moment of $Z_n$ purely in terms of the integrand $J_j$ and the variance $\sigma^2$ of the increments. This is the discrete analogue of the Itô isometry $E\!\left[\left(\int_0^t A_s\, dB_s\right)^2\right] = \int_0^t E[A_s^2]\, ds$, which is the cornerstone of the $L^2$ theory of stochastic integration.
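
The variance rule can be confirmed exactly on a small case. This sketch is illustrative only and rests on assumptions that are not in the notes: the steps are uniform on \(\{-1, 0, 1\}\) (so \(\sigma^2 = 2/3\)), and the predictable integrand is the hypothetical choice \(J_j = 1 + \lvert S_{j-1} \rvert\). Both sides of the rule are computed by enumerating all paths with rational arithmetic.

```python
from fractions import Fraction
from itertools import product

steps = (-1, 0, 1)        # assumed step law: uniform on these values, mean zero
sigma2 = Fraction(sum(x * x for x in steps), len(steps))

def J(history):
    """Hypothetical predictable integrand: 1 + |S_{j-1}|."""
    return 1 + abs(sum(history))

n = 4
paths = list(product(steps, repeat=n))        # all 3^n equally likely paths
p = Fraction(1, len(paths))

# Left side: E[Z_n^2] with Z_n = sum_j J_j X_j (w[:j] is the history before step j+1).
lhs = sum(p * sum(J(w[:j]) * w[j] for j in range(n)) ** 2 for w in paths)

# Right side: sigma^2 * sum_j E[J_j^2].
rhs = sigma2 * sum(sum(p * J(w[:j]) ** 2 for w in paths) for j in range(n))

print(lhs == rhs, lhs)
```

The equality holds exactly, not just approximately, because the enumeration covers the full sample space. Swapping in another predictable rule for `J` should leave the equality intact, while a rule that reads `w[j]` itself would generally break it.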

Comparison: discrete stochastic integral vs. Itô integral

| Feature | Discrete: \(Z_n = \sum J_j X_j\) | Continuous: \(\int_0^t A_s\, dB_s\) |
|---|---|---|
| Integrand condition | \(J_n\) is \(\mathcal{F}_{n-1}\)-measurable (predictable) | \(A_s\) is adapted, square-integrable |
| Martingale property | \(Z_n\) is a martingale | \(\int_0^t A_s\, dB_s\) is a martingale |
| Variance rule | \(E[Z_n^2] = \sigma^2 \sum E[J_j^2]\) | \(E\!\left[\left(\int_0^t A_s\, dB_s\right)^2\right] = \int_0^t E[A_s^2]\, ds\) |
| Linearity | ✓ direct from summation | ✓ by construction |

Part 3 — A Maximal Inequality

Theorem 1.7.1 — Doob's Maximal Inequality for Submartingales Theorem

Suppose \(Y_n\) is a nonnegative submartingale with respect to \(\{\mathcal{F}_n\}\), and \(\bar{Y}_n = \max\{Y_0, Y_1, \ldots, Y_n\}\). Then for every \(a > 0\),

\[P\{\bar{Y}_n \geq a\} \leq \frac{1}{a}\, E[Y_n].\]

Proof:

Let \(T = \min\{k \leq n : Y_k \geq a\}\) (with \(T = n+1\) if no such \(k\) exists). Then:

\[\{\bar{Y}_n \geq a\} = \bigsqcup_{k=0}^n A_k, \quad A_k = \{T = k\}.\]

Each \(A_k \in \mathcal{F}_k\). Since \(Y_n\) is a submartingale, \(E[Y_n \mid \mathcal{F}_k] \geq Y_k\) for \(k \leq n\), so:

\[E[Y_n \mathbf{1}_{A_k}] = E\bigl[E[Y_n \mid \mathcal{F}_k]\, \mathbf{1}_{A_k}\bigr] \geq E[Y_k\, \mathbf{1}_{A_k}] \geq a\, P(A_k).\]

Summing over \(k = 0, 1, \ldots, n\):

\[E[Y_n] \geq E\!\left[Y_n\, \mathbf{1}_{\{\bar{Y}_n \geq a\}}\right] = \sum_{k=0}^n E[Y_n\, \mathbf{1}_{A_k}] \geq a\, P\{\bar{Y}_n \geq a\}.\]

Dividing by \(a\) gives the result.
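
To see the inequality in action, here is a minimal sketch that evaluates both sides exactly for one concrete case. The choice of example is an assumption made for illustration: \(Y_k = \lvert S_k \rvert\) for the fair coin-tossing walk (a nonnegative submartingale), with \(n = 6\) and \(a = 3\).

```python
from fractions import Fraction
from itertools import product

n, a = 6, 3
paths = list(product((-1, 1), repeat=n))      # all 2^n equally likely paths
p = Fraction(1, len(paths))

def abs_walk(path):
    """Y_0..Y_n with Y_k = |S_k| along one path."""
    s, ys = 0, [0]
    for x in path:
        s += x
        ys.append(abs(s))
    return ys

# Left side: probability that the running maximum reaches level a.
prob_max = sum(p for w in paths if max(abs_walk(w)) >= a)

# Right side: E[Y_n] / a, using only the terminal value.
bound = sum(p * abs_walk(w)[-1] for w in paths) / a

print(prob_max, bound, prob_max <= bound)
```

Here the exact probability is 7/16 and the bound is 5/8, so the inequality holds with room to spare. The bound uses only the terminal value \(Y_n\), yet it controls the whole path up to time \(n\).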

Corollary 1.7.2 — Doob's $L^2$ Maximal Inequality Corollary

If \(M_n\) is a square integrable martingale with respect to \(\{\mathcal{F}_n\}\) and \(\overline{M}_n = \max\{\lvert M_0 \rvert, \ldots, \lvert M_n \rvert\}\), then for every \(a > 0\),

\[P\{\overline{M}_n \geq a\} \leq \frac{E[M_n^2]}{a^2}.\]

Proof: Exercise 1.15 shows that if \(M_n\) is a martingale and \(\varphi\) is a convex function, then \(\varphi(M_n)\) is a submartingale (provided \(\varphi(M_n)\) is integrable, which holds here by square integrability). Taking \(\varphi(x) = x^2\) gives that \(M_n^2\) is a nonnegative submartingale. Apply Theorem 1.7.1 to \(Y_n = M_n^2\) with threshold \(a^2\):

\[P\{\bar{Y}_n \geq a^2\} \leq \frac{E[M_n^2]}{a^2}.\]

Since \(\{\bar{Y}_n \geq a^2\} = \{\max_k M_k^2 \geq a^2\} = \{\overline{M}_n \geq a\}\), the result follows.
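
As a worked instance (added for illustration, not taken from the text), take \(M_n = S_n\), a random walk whose steps are i.i.d. with mean zero and variance \(\sigma^2\). Then \(E[S_n^2] = n\sigma^2\), and the corollary gives, for any \(\lambda > 0\) and \(a = \lambda \sigma \sqrt{n}\),

\[P\Bigl\{\max_{0 \leq k \leq n} \lvert S_k \rvert \geq a\Bigr\} \leq \frac{n\sigma^2}{a^2}, \qquad P\Bigl\{\max_{0 \leq k \leq n} \lvert S_k \rvert \geq \lambda \sigma \sqrt{n}\Bigr\} \leq \frac{1}{\lambda^2}.\]

This is the same bound that Chebyshev's inequality gives for the single variable \(S_n\); the maximal inequality extends it to the entire path up to time \(n\) at no extra cost.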

Why this matters: The $L^2$ maximal inequality is used throughout Chapter 3 to show that stochastic integrals defined on dyadic times extend continuously to all times (the Kolmogorov continuity argument). It is also used to prove the OST under the $L^2$ boundedness condition (Theorem 1.3.3 of §1.3). Controlling the running maximum by the second moment at the final time is the key tool.
⚠️ Common Misconception
Wrong: "Doob's maximal inequality applies directly to any martingale: $P\{\max_{k \leq n} M_k \geq a\} \leq E[M_n]/a$."
Correct: Theorem 1.7.1 applies to nonnegative submartingales, not arbitrary martingales. A martingale can take negative values and $E[M_n]$ may be zero, in which case the claimed bound fails: for the simple random walk and $n \geq 1$, $E[S_n] = 0$ but $P\{\max_{k \leq n} S_k \geq 1\} > 0$. For a general martingale $M_n$, we apply the theorem to the nonnegative submartingale $Y_n = \lvert M_n \rvert$ (since $|\cdot|$ is convex) to get $P\{\overline{M}_n \geq a\} \leq E[\lvert M_n \rvert]/a$. The $L^2$ version in Corollary 1.7.2 applies it to $Y_n = M_n^2$ and replaces $a$ by $a^2$, giving a $1/a^2$ bound that decays faster in $a$ under the stronger $L^2$ hypothesis.

Part 4 — Worked Example: Verifying the Variance Rule

Variance rule for coin-tossing random walk Example

Setup: \(X_1, X_2, \ldots\) i.i.d. with \(P\{X_j = \pm 1\} = \tfrac{1}{2}\), so \(\sigma^2 = 1\). Let \(S_n = X_1 + \cdots + X_n\).

Integrand: \(J_j = S_{j-1}\) (the running sum just before step \(j\), which is \(\mathcal{F}_{j-1}\)-measurable ✓).

Integral: \(Z_n = \sum_{j=1}^n S_{j-1} X_j.\)

Apply the variance rule:

\[E[Z_n^2] = \sigma^2 \sum_{j=1}^n E[J_j^2] = \sum_{j=1}^n E[S_{j-1}^2] = \sum_{j=1}^n (j-1) = \frac{n(n-1)}{2}.\]

Here we used \(E[S_{j-1}^2] = j - 1\) (since \(\mathrm{Var}[S_{j-1}] = j-1\) for zero-mean i.i.d. increments with \(\sigma^2 = 1\)).

Direct check with Itô’s formula analogy: The integral \(Z_n = \sum S_{j-1} X_j\) is the discrete analogue of \(\int_0^t B_s\, dB_s\), which by Itô’s formula equals \(\tfrac{1}{2}(B_t^2 - t)\). The variance of \(\tfrac{1}{2}(B_t^2 - t)\) at time \(t\) is \(\tfrac{t^2}{2}\), consistent with \(n(n-1)/2 \approx n^2/2\) for large \(n\).
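
In the coin-tossing case the analogy can be made exact. The short derivation below is a sketch added for illustration; it uses only \(X_j^2 = 1\), \(E[S_n^2] = n\), and the fourth moment \(E[S_n^4] = 3n^2 - 2n\) of the simple random walk. Telescoping \(S_j^2 - S_{j-1}^2 = 2 S_{j-1} X_j + X_j^2\) over \(j = 1, \ldots, n\) gives

\[S_n^2 = \sum_{j=1}^n \bigl(2 S_{j-1} X_j + 1\bigr) = 2 Z_n + n, \qquad \text{so} \qquad Z_n = \tfrac{1}{2}\bigl(S_n^2 - n\bigr),\]

\[E[Z_n^2] = \tfrac{1}{4}\,\mathrm{Var}[S_n^2] = \tfrac{1}{4}\bigl(E[S_n^4] - n^2\bigr) = \tfrac{1}{4}\bigl(2n^2 - 2n\bigr) = \frac{n(n-1)}{2}.\]

The identity \(Z_n = \tfrac{1}{2}(S_n^2 - n)\) is the discrete counterpart of \(\int_0^t B_s\, dB_s = \tfrac{1}{2}(B_t^2 - t)\), and the second moment agrees with the value obtained from the variance rule.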

Key point: The variance rule allows computing $E[Z_n^2]$ without expanding the square and tracking all cross terms — orthogonality kills them all. The only computation needed is $E[J_j^2]$ for each $j$, which is a much simpler task.
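
The value \(n(n-1)/2\) can also be confirmed by brute force. The following is a minimal sketch, assuming fair coin-tossing steps and the integrand \(J_j = S_{j-1}\) from this example; the function name `second_moment` is hypothetical. It computes \(E[Z_n^2]\) exactly by enumerating all \(2^n\) paths, which is the expand-everything computation the variance rule lets us avoid.

```python
from fractions import Fraction
from itertools import product

def second_moment(n):
    """Exact E[Z_n^2] for Z_n = sum_j S_{j-1} X_j over all 2^n fair-coin paths."""
    total = 0
    for path in product((-1, 1), repeat=n):
        s = z = 0
        for x in path:
            z += s * x      # J_j = S_{j-1}: uses the sum before the new step
            s += x
        total += z * z
    return Fraction(total, 2 ** n)

results = {n: second_moment(n) for n in range(1, 9)}
formula_ok = all(results[n] == Fraction(n * (n - 1), 2) for n in results)

print(formula_ok, results[8])
```

The enumeration costs \(2^n\) path evaluations, while the variance rule needs only the \(n\) numbers \(E[S_{j-1}^2] = j - 1\).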

Part 5 — The Three Properties as a Unified Blueprint

Together, the three sections prepare a package that will recur throughout Chapters 3 and 4: the three properties of the stochastic integral from §1.6, plus the maximal inequality from §1.7 that controls it.

| Property | Discrete version (§1.6–1.7) | Continuous version (Chapter 3) |
|---|---|---|
| Martingale | \(Z_n = \sum J_j X_j\) is a martingale | \(\int_0^t A_s\, dB_s\) is a martingale |
| Linearity | \(\sum (aJ_j + bK_j) X_j = a Z_n^J + b Z_n^K\) | \(\int (aA + bC)\, dB = a\int A\, dB + b \int C\, dB\) |
| Variance rule (Itô isometry) | \(E[Z_n^2] = \sigma^2 \sum E[J_j^2]\) | \(E\!\left[\left(\int_0^t A_s\, dB_s\right)^2\right] = \int_0^t E[A_s^2]\, ds\) |
| Maximal inequality | \(P\{\overline{M}_n \geq a\} \leq E[M_n^2]/a^2\) | \(P\{\sup_{s \leq t} \lvert M_s \rvert \geq a\} \leq E[M_t^2]/a^2\) |

Term Glossary

Square integrable martingale Definition
A martingale $M_n$ with $E[M_n^2] < \infty$ for each $n$. Stronger than ordinary integrability ($E[\lvert M_n \rvert] < \infty$), weaker than uniform $L^2$ boundedness ($\sup_n E[M_n^2] \leq C$). The natural domain for the Pythagorean identity and the variance rule.
$L^2(\Omega, \mathcal{F}, P)$ Definition
The Hilbert space of square-integrable random variables on $(\Omega, \mathcal{F}, P)$, with inner product $(X,Y) = E[XY]$ and norm $\lVert X \rVert_2 = \sqrt{E[X^2]}$. The conditional expectation $E[Y \mid \mathcal{F}_n]$ is the orthogonal projection of $Y$ onto the closed subspace $L^2(\Omega, \mathcal{F}_n, P)$, minimising $E[(Y-Z)^2]$ over all $\mathcal{F}_n$-measurable $Z$.
Orthogonal random variables Definition
$X$ and $Y$ are orthogonal if $E[XY] = E[X]\,E[Y]$. For zero-mean variables this reduces to $E[XY] = 0$, i.e., $(X,Y) = 0$ in $L^2$. Independent zero-mean variables are orthogonal; the converse fails. Proposition 1.5.1 shows martingale increments $\Delta M_n$ are mutually orthogonal for $n \neq m$ using only the martingale property.
Pythagorean identity for martingales Property
$E[M_n^2] = E[M_0^2] + \sum_{j=1}^n E[(\Delta M_j)^2]$ for any square integrable martingale. Follows from orthogonality of increments: expanding $M_n^2 = (M_0 + \sum \Delta M_j)^2$ and taking expectations, all cross terms $E[\Delta M_j \cdot \Delta M_k]$ for $j \neq k$ vanish. This is the discrete analogue of the Itô isometry.
Predictable process Definition
A sequence $J_1, J_2, \ldots$ where $J_n$ is $\mathcal{F}_{n-1}$-measurable for every $n$. The value of $J_n$ is known strictly before time $n$. This is the allowable betting condition from §1.2, and the exact discrete analogue of the adapted condition for Itô integrands.
Discrete stochastic integral $Z_n = \sum_{j=1}^n J_j X_j$ Definition
The sum of a predictable integrand $J_j$ against the i.i.d. increments $X_j$ of a random walk. Satisfies three properties: (1) martingale; (2) linearity; (3) variance rule $E[Z_n^2] = \sigma^2 \sum E[J_j^2]$. The discrete prototype of the Itô integral $\int_0^t A_s\, dB_s$ in Chapter 3.
Variance rule (Itô isometry, discrete form) Property
$\mathrm{Var}[Z_n] = E[Z_n^2] = \sigma^2 \sum_{j=1}^n E[J_j^2]$. Proved by using: (a) orthogonality of martingale increments to kill cross terms; (b) the pull-out property to separate $J_j^2$ from $X_j^2$; (c) independence of $X_j$ from $\mathcal{F}_{j-1}$ to replace $E[X_j^2 \mid \mathcal{F}_{j-1}]$ by $\sigma^2$.
Doob's maximal inequality Theorem
For a nonnegative submartingale $Y_n$ with running maximum $\bar{Y}_n = \max_{k \leq n} Y_k$: $$P\{\bar{Y}_n \geq a\} \leq \frac{E[Y_n]}{a}.$$ Proved by partitioning $\{\bar{Y}_n \geq a\}$ into the disjoint events $A_k = \{T = k\}$, where $T$ is the first time $Y$ reaches level $a$ or above, and using the submartingale inequality $E[Y_n \mathbf{1}_{A_k}] \geq a\,P(A_k)$ for each $k$.
Doob's $L^2$ maximal inequality Theorem
For a square integrable martingale $M_n$ with $\overline{M}_n = \max_{k \leq n} \lvert M_k \rvert$: $$P\{\overline{M}_n \geq a\} \leq \frac{E[M_n^2]}{a^2}.$$ Follows from Theorem 1.7.1 applied to the submartingale $Y_n = M_n^2$ (convexity of $x^2$ makes $M_n^2$ a submartingale). Controls the running maximum by the terminal second moment — used in the Kolmogorov continuity argument and in the proof of OST III.