# One Shot Deviation Principle

Posted on April 15, 2016
Tags: game theory

The content here is reproduced from Fudenberg and Tirole’s 1991 text.

In infinite horizon games, the one-shot deviation principle is a useful tool to check whether a strategy profile $$\sigma$$ is subgame perfect. Informally, it states that we only need to check deviations $$(\sigma_i', \sigma_{-i})$$ where $$\sigma_{i}'$$ differs from $$\sigma_i$$ only at a single history.

We begin with the theorem for finite horizon games.

Theorem 1 (One-Stage-Deviation Principle for Finite Horizon Games). In a finite multi-stage game with observed actions, a strategy profile $$s$$ is subgame perfect if and only if it satisfies the one-stage-deviation condition:

there is no player $$i$$ and history $$h^t$$ such that, conditional on $$h^t$$ being reached, player $$i$$ has a profitable deviation $$s_i'$$ that agrees with $$s_i$$ everywhere except at $$h^t$$.

Proof. The “only if” direction is immediate from the definition of subgame perfection. For sufficiency, suppose that a strategy profile $$s$$ satisfies the one-stage-deviation condition but is not subgame perfect. Then there is some history $$h^t$$ and player $$i$$ such that

$$\label{eq:1} u_i(s_i', s_{-i}; h^t) > u_i(s; h^t) \quad \text{for some strategy } s_i'.$$

Let $$T$$ be the largest $$\tau$$ for which the set

$$\label{eq:2} D(\tau) := \{ h^{\tau} : s_i'(h^{\tau}) \neq s_i(h^{\tau}) \}$$

is non-empty (i.e. $$T$$ is the last stage at which $$s_i'$$ and $$s_i$$ differ; it exists because the horizon is finite). Since $$s_i'$$ and $$s_i$$ agree at every stage $$\tau \geq T+1$$, the one-stage-deviation condition implies that, for every history $$h^T$$,

$$\label{eq:3} u_i(s; h^T) \geq u_i(s_i', s_{-i}; h^T).$$

Now define $$s_i''$$ to be the strategy that agrees with $$s_i'$$ at stages $$t \leq \tau \leq T-1$$ and with $$s_i$$ from stage $$T$$ onward. By \eqref{eq:1} and \eqref{eq:3},

$$\label{eq:4} u_i(s_i'', s_{-i}; h^t) \geq u_i(s_i', s_{-i}; h^t) > u_i(s; h^t).$$

But $$s_i''$$ differs from $$s_i$$ only at stages up to $$T-1$$. Repeating the process, we obtain $$s_i'''$$ satisfying $$u_i(s_i''',s_{-i};h^t)>u_i(s;h^t)$$ that differs from $$s_i$$ only at stages up to $$T-2$$. Continuing in this way, the process eventually yields a strategy $$S_i$$ with

$$\label{eq:5} u_i(S_i, s_{-i}; h^t) > u_i(s_i, s_{-i};h^t)$$

that differs from $$s_i$$ only at stage $$t$$. This contradicts the one-stage-deviation condition and completes the proof. $$\square$$
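As a concrete illustration of the one-stage-deviation condition in a finite game, here is a minimal Python sketch that brute-forces the check for a twice-repeated prisoner's dilemma without discounting. The stage payoffs, the two-stage horizon, and the names `play`, `deviate_once` and `satisfies_osd` are assumptions made for this example; they do not come from Fudenberg and Tirole.

```python
from itertools import product

ACTIONS = ("C", "D")
# Hypothetical stage payoffs: STAGE[(a0, a1)] = (payoff to player 0, payoff to player 1).
STAGE = {("C", "C"): (3, 3), ("C", "D"): (0, 4),
         ("D", "C"): (4, 0), ("D", "D"): (1, 1)}
HORIZON = 2  # number of stages (assumed for illustration)


def play(profile, history):
    """Continue play from `history` under `profile`.

    Returns the undiscounted continuation payoffs of both players,
    summed over the stages from len(history) to the end of the game.
    """
    h = tuple(history)
    total = [0, 0]
    while len(h) < HORIZON:
        a = tuple(s(h) for s in profile)  # action profile at this stage
        total = [x + y for x, y in zip(total, STAGE[a])]
        h += (a,)
    return total


def deviate_once(strategy, at, action):
    """Strategy that equals `strategy` everywhere except at the history `at`."""
    return lambda h: action if h == at else strategy(h)


def all_histories():
    """All non-terminal histories: tuples of past action profiles."""
    for t in range(HORIZON):
        yield from product(STAGE.keys(), repeat=t)


def satisfies_osd(profile):
    """True if no player gains from deviating at exactly one history."""
    for h in all_histories():
        base = play(profile, h)
        for i in (0, 1):
            for a in ACTIONS:
                dev = list(profile)
                dev[i] = deviate_once(profile[i], h, a)
                if play(dev, h)[i] > base[i]:
                    return False
    return True


always_defect = lambda h: "D"
coop_then_defect = lambda h: "C" if len(h) == 0 else "D"

print(satisfies_osd((always_defect, always_defect)))        # True
print(satisfies_osd((coop_then_defect, coop_then_defect)))  # False
```

In the second profile, defecting at the initial history alone raises a player's payoff from 4 to 5, so the condition fails there and, by Theorem 1, the profile is not subgame perfect.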

The insight of Theorem 1 carries over to infinite-horizon games: its proof shows that, when $$s$$ satisfies the one-stage-deviation condition, any profitable deviation $$s_i'$$ from $$s_i$$ must differ from $$s_i$$ at infinitely many stages. That is, for every $$T$$ there exist $$\tau \geq T$$ and a history $$h^{\tau}$$ such that $$s_i'(h^{\tau}) \neq s_i(h^{\tau})$$. The condition that rules out such deviations is made precise below.

Definition. A multi-stage game with observed actions is continuous at infinity if for each player $$i$$

$$\label{eq:6} \sup_{h, \eta} |u_i(h) - u_i(\eta)| \to 0 \quad \text{as } t \to \infty$$

where the supremum is taken over all pairs of infinite histories $$h$$ and $$\eta$$ such that $$h^t = \eta^t$$, and $$h^t$$ denotes the restriction of the infinite history $$h$$ to the first $$t$$ periods.

In English, this says that for any two histories that agree in the first $$t$$ periods, the difference between their utilities becomes arbitrarily small, uniformly across such pairs, as $$t \to \infty$$.

Remark. A multi-stage game in which payoffs in each stage are uniformly bounded and players discount future payoffs with a discount factor $$\delta < 1$$ is continuous at infinity.
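To see why the remark holds, here is a short derivation under assumptions that are not spelled out above: overall payoffs take the discounted-sum form $$u_i(h) = \sum_{\tau \geq 0} \delta^{\tau} g_i(a^{\tau})$$ with $$0 \leq \delta < 1$$, stage payoffs satisfy $$|g_i| \leq B$$, and $$h^t$$ records stages $$0, \dots, t-1$$. The symbols $$g_i$$, $$a^{\tau}$$, $$\tilde a^{\tau}$$ and $$B$$ are introduced here only for illustration. If $$h$$ and $$\eta$$ agree in the first $$t$$ periods, with stage-$$\tau$$ action profiles $$a^{\tau}$$ and $$\tilde a^{\tau}$$ respectively, then only stages $$\tau \geq t$$ contribute to the difference:

$$|u_i(h) - u_i(\eta)| = \Big| \sum_{\tau \geq t} \delta^{\tau} \big( g_i(a^{\tau}) - g_i(\tilde a^{\tau}) \big) \Big| \leq \sum_{\tau \geq t} 2B\,\delta^{\tau} = \frac{2B\,\delta^{t}}{1-\delta} \to 0 \quad \text{as } t \to \infty.$$

The bound does not depend on the particular pair $$h, \eta$$, so the supremum in \eqref{eq:6} also vanishes.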

Theorem 2 (One-Stage Deviation Principle for Infinite-Horizon Games). In an infinite-horizon multi-stage game with observed actions that is continuous at infinity, a profile $$s$$ is subgame perfect if and only if it satisfies the one-stage-deviation condition in Theorem 1.

Proof. As in Theorem 1, necessity is immediate. For sufficiency, suppose that $$s$$ satisfies the one-stage-deviation condition but is not subgame perfect. Then there exist a player $$i$$, a history $$h^t$$ and a strategy $$s_i'$$ for which

$$\label{eq:7} \epsilon := u_i(s_i', s_{-i}; h^t) - u_i(s; h^t) > 0.$$

By continuity at infinity, there exists $$T$$ large enough such that

$$\label{eq:8} |u_i(s_i', s_{-i}; h^t) - u_i(S_i, s_{-i}; h^t)| < \epsilon/2$$

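where $$S_i$$ is the strategy that agrees with $$s_i'$$ up to period $$T$$ and follows $$s_i$$ thereafter. By \eqref{eq:7} and \eqref{eq:8}, $$u_i(S_i, s_{-i}; h^t) - u_i(s; h^t) > \epsilon/2 > 0$$, so $$S_i$$ is a profitable deviation for player $$i$$ at $$h^t$$ that differs from $$s_i$$ in only finitely many periods. This contradicts the argument in the proof of Theorem 1, which shows that no such deviation exists when the one-stage-deviation condition holds. $$\square$$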
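To illustrate how Theorem 2 is typically used, the following Python sketch checks the one-stage-deviation condition for the grim-trigger profile in an infinitely repeated prisoner's dilemma with discounting. The example is not taken from the text: the payoff values and the names `grim_trigger_passes`, `reward`, `temptation`, `punishment` and `sucker` are hypothetical. Because grim trigger depends on the history only through whether anyone has defected, it is enough to check one deviation in each of the two phases.

```python
def grim_trigger_passes(delta, reward=3.0, temptation=4.0,
                        punishment=1.0, sucker=0.0):
    """Check the one-stage-deviation condition for grim trigger.

    Grim trigger: cooperate as long as nobody has ever defected,
    otherwise defect forever. Payoffs are discounted sums of stage
    payoffs with discount factor `delta` (all values are assumptions).
    """
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")

    # Value of being punished forever, starting from the current period.
    punish_forever = punishment / (1.0 - delta)

    # Cooperative phase: conform forever vs. defect once, then follow
    # grim trigger (which prescribes mutual defection from then on).
    coop_conform = reward / (1.0 - delta)
    coop_deviate = temptation + delta * punish_forever

    # Punishment phase: conform (defect) vs. cooperate once against a
    # defecting opponent, then return to defecting.
    punish_conform = punish_forever
    punish_deviate = sucker + delta * punish_forever

    return coop_conform >= coop_deviate and punish_conform >= punish_deviate


print(grim_trigger_passes(0.5))  # True
print(grim_trigger_passes(0.2))  # False
```

With these assumed payoffs the cooperative-phase inequality $$3/(1-\delta) \geq 4 + \delta/(1-\delta)$$ reduces to $$\delta \geq 1/3$$, while the punishment-phase inequality holds for every $$\delta$$. Since bounded stage payoffs and discounting make the game continuous at infinity, passing these two checks is enough for subgame perfection by Theorem 2.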