Zorn's lemma

Zorn's lemma and other maximal principles

Definitions. In a partial order $(P, <)$ , we write $a \le b$ to mean $a < b$ or $a = b$ . A subset $C \subseteq P$ is a chain if $C$ is totally ordered by $<$ (its elements are pairwise comparable). An element $u \in P$ is an upper bound of a set $A \subseteq P$ if $a \le u$ for all $a \in A$ . An element $m \in P$ is maximal if there is no $y \in P$ such that $m < y$ . A chain $C$ is maximal if there is no chain $C'$ such that $C \subset C'$ (proper inclusion).
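To make these definitions concrete, here is a small sketch on a finite poset. The choice of $P = \{2, 3, 4, 6, 8, 9\}$ ordered by divisibility, and all function names, are assumptions made only for illustration. It enumerates chains by brute force, so it is only meant for tiny examples.

```python
from itertools import combinations

# Illustrative poset (an assumption, not from the text): divisibility on P.
P = [2, 3, 4, 6, 8, 9]

def lt(a, b):
    """Strict order: a < b iff a properly divides b."""
    return a != b and b % a == 0

def le(a, b):
    """a <= b iff a < b or a = b."""
    return a == b or lt(a, b)

def is_chain(C):
    """A chain is a subset whose elements are pairwise comparable."""
    return all(le(a, b) or le(b, a) for a, b in combinations(C, 2))

def upper_bounds(A):
    """All u in P with a <= u for every a in A."""
    return [u for u in P if all(le(a, u) for a in A)]

def maximal_elements():
    """All m in P such that no y in P satisfies m < y."""
    return [m for m in P if not any(lt(m, y) for y in P)]

def maximal_chains():
    """Chains that are not properly contained in any other chain."""
    chains = [frozenset(C)
              for r in range(len(P) + 1)
              for C in combinations(P, r)
              if is_chain(C)]
    return [C for C in chains if not any(C < D for D in chains)]
```

In this example the maximal elements are 6, 8 and 9, and every maximal chain ends in one of them, for instance $\{2, 4, 8\}$ or $\{3, 9\}$ .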

Zorn’s lemma. If $P \ne \emptyset$ is partially ordered and every chain in $P$ has an upper bound in $P$ , then $P$ has a maximal element.

Hausdorff maximal principle. Every partially-ordered set has a maximal chain.

These two statements are equivalent, as the proposition below shows. Hausdorff proved his version in 1914, Kuratowski proved what’s now known as Zorn’s lemma in 1922, and Zorn himself proved it independently in 1935. For this reason, some sources call it the “Kuratowski-Zorn lemma”.

Definitions. A family $\mathcal{F}$ of sets has finite character when: $A \in \mathcal{F}$ if and only if $F \in \mathcal{F}$ for every finite subset $F$ of $A$ . A family of sets is always a partially-ordered set under inclusion ( $A \le B$ iff $A \subseteq B$ ). Given a family $\mathcal{F}$ partially-ordered by inclusion, it is “closed under unions of chains” if, for every chain $C \subseteq \mathcal{F}$ , $\bigcup_{X \in C} X = \bigcup C \in \mathcal{F}$ .
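As a sanity check of these two definitions, the following sketch tests them by brute force on families of subsets of a small ground set. The ground set `U`, the example families and the function names are all illustrative assumptions. On a finite ground set every subset is finite, so finite character reduces to being closed under taking subsets. The definition only becomes subtle for infinite sets.

```python
from itertools import combinations

# Illustrative ground set (an assumption for this sketch).
U = [1, 2, 3, 4]

def subsets(A):
    """All subsets of A, as frozensets."""
    A = sorted(A)
    return [frozenset(S) for r in range(len(A) + 1) for S in combinations(A, r)]

def has_finite_character(F):
    """A in F iff every (finite) subset of A is in F, for all A within U."""
    F = set(F)
    return all((A in F) == all(S in F for S in subsets(A)) for A in subsets(U))

def is_inclusion_chain(C):
    """Members of C are pairwise comparable under inclusion."""
    return all(A <= B or B <= A for A, B in combinations(C, 2))

def closed_under_chain_unions(F):
    """The union of every chain C within F (including the empty chain) is in F."""
    members = list(set(F))
    family = set(members)
    for r in range(len(members) + 1):
        for C in combinations(members, r):
            if is_inclusion_chain(C) and frozenset().union(*C) not in family:
                return False
    return True

def maximal_members(F):
    """Members of F not properly contained in another member."""
    return [A for A in F if not any(A < B for B in F)]

# Subsets of U containing no two consecutive integers.
independent = [S for S in subsets(U)
               if all(b - a != 1 for a, b in combinations(sorted(S), 2))]

# Closed under unions of chains, but {1} and {2} are missing.
not_fc = [frozenset(), frozenset({1, 2})]
```

Here `independent` has finite character and is closed under unions of chains, with maximal members $\{1, 3\}$ , $\{1, 4\}$ and $\{2, 4\}$ . The family `not_fc` is closed under unions of chains without having finite character, so the first condition is the weaker one.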

Zorn’s lemma for inclusions. If $\mathcal{F}$ is closed under unions of chains, then it has a maximal element.

Tukey’s lemma. If $\mathcal{F}$ is nonempty and has finite character, it has a maximal element (with respect to inclusion).

Proposition. The following statements are equivalent.

(1) Hausdorff maximal principle
(2) Zorn’s lemma
(3) Zorn’s lemma for inclusions
(4) Tukey’s lemma
(5) Axiom of choice

Proof. (1) ⇒ (2) By the Hausdorff maximal principle, $P$ has a maximal chain $M$ . Being a chain, it has an upper bound $u$ . Since $M \cup \{u\}$ is a chain containing $M$ , maximality of $M$ gives $u \in M$ . There can’t be a $y > u$ , since then $y \notin M$ and $M \cup \{y\} \supset M$ would be a bigger chain. So $u$ is maximal.

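(2) ⇒ (3) If the partial order is inclusion, then for any given chain $C \subseteq \mathcal{F}$ , we have that $X \subseteq \bigcup C$ for all $X \in C$ , and $\bigcup C \in \mathcal{F}$ by hypothesis, so $\bigcup C$ is an upper bound of $C$ in $\mathcal{F}$ . Taking $C = \emptyset$ shows $\emptyset \in \mathcal{F}$ , so $\mathcal{F}$ is nonempty. Hence every chain has an upper bound, and Zorn’s lemma gives a maximal element of $\mathcal{F}$ .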

(3) ⇒ (4) If $\mathcal{F}$ has finite character, we just have to show it’s closed under unions of chains. If $C \subseteq \mathcal{F}$ is a chain, let $\{x_1, \dots, x_n\} \subseteq \bigcup C$ . For each $1 \le i \le n$ , let $X_i \in C$ such that $x_i \in X_i$ . Since $\{X_1, \dots, X_n\}$ is a finite subset of a chain, one of them is a maximum element, so that $\{x_1, \dots, x_n\} \subseteq X_k$ for some $1 \le k \le n$ . But then $\{x_1, \dots, x_n\} \in \mathcal{F}$ since $X_k \in \mathcal{F}$ . This proves every finite subset of $\bigcup C$ is in $\mathcal{F}$ and so $\bigcup C \in \mathcal{F}$ .

(4) ⇒ (5) Let $X$ be any set and define $\mathcal{F}$ as the family of functions $f$ with $\operatorname{dom}(f) \subseteq X$ which are choice functions on their domain, that is, $f(x) \in x$ for all $x \in \operatorname{dom}(f)$ . A function is, set-theoretically, a set of ordered pairs $(x, f(x))$ . So any finite subset of $f \in \mathcal{F}$ is again a choice function whose domain is a subset of $X$ . Conversely, if a set of ordered pairs $f$ is not in $\mathcal{F}$ , then either $f$ is not a function, so it contains two pairs $(x, y)$ and $(x, y')$ with $y \ne y'$ , and $\{(x, y), (x, y')\} \subseteq f$ is a finite subset not in $\mathcal{F}$ ; or there is some $x \in \operatorname{dom}(f) \setminus X$ ; or $f$ is not a choice function, which means $f(x) \notin x$ for some $x$ . In the last two cases, $\{(x, f(x))\} \subseteq f$ is a finite subset not in $\mathcal{F}$ . This proves $\mathcal{F}$ has finite character, and it is nonempty because it contains the empty function, so by Tukey’s lemma it has a maximal element $f$ . If there were a nonempty $x \in X$ with $x \notin \operatorname{dom}(f)$ , we could choose $y \in x$ and extend $f$ to $f \cup \{(x, y)\}$ , which would contradict its maximality. So $\operatorname{dom}(f) = X \setminus \{\emptyset\}$ , and therefore $f$ is a choice function on $X$ .
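The family used in this step can be inspected directly when $X$ is finite. The sketch below represents partial choice functions as sets of pairs and extends the empty function until it is maximal. The set `X`, the use of `min` to pick an element, and the function names are assumptions for illustration. In the finite case no appeal to Tukey’s lemma is needed, so this only illustrates the objects involved and does not replace the argument.

```python
# Illustrative finite family of sets (an assumption); it includes the empty set.
X = [frozenset({1, 2}), frozenset({2, 3}), frozenset({4}), frozenset()]

def in_family(f):
    """True if the set of pairs f is a function with domain inside X and f(x) in x."""
    domain = [x for x, _ in f]
    is_function = len(domain) == len(set(domain))
    return is_function and all(x in X and y in x for x, y in f)

def maximal_choice_function():
    """Extend the empty function one pair at a time until no extension is possible."""
    f = set()
    while True:
        missing = [x for x in X if x and all(x != d for d, _ in f)]
        if not missing:
            return f
        x = missing[0]
        f.add((x, min(x)))  # min is an arbitrary way to pick some y in x

def is_maximal(f):
    """No pair (x, y) with x in X and y in x can be added while staying in the family."""
    candidates = {(x, y) for x in X for y in x} - f
    return in_family(f) and not any(in_family(f | {p}) for p in candidates)
```

The resulting function is defined on every nonempty member of `X` and cannot be extended, matching $\operatorname{dom}(f) = X \setminus \{\emptyset\}$ in the proof.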

(5) ⇒ (1) Let $\mathcal{C}$ be the set of chains of the partially-ordered set $P$ . Let $f$ be a choice function on $\mathcal{P}(P)$ . For each chain $C \in \mathcal{C}$ , define $C^* = \{ x \in P \setminus C \ : \ C \cup \{x\} \in \mathcal{C} \}$ , and

$$\tilde{C} = \begin{cases} C \cup \{f(C^*)\} & \text{if } C^* \ne \emptyset \\ C & \text{if } C^* = \emptyset \end{cases}$$

It follows that a chain $C$ is maximal iff $\tilde{C} \subseteq C$ (equivalently, $\tilde{C} = C$ ). A set $\mathcal{T} \subseteq \mathcal{C}$ is a tower when

1. $\emptyset \in \mathcal{T}$ .
2. If $T \subseteq \mathcal{T}$ is a chain (under inclusion), then $\bigcup T \in \mathcal{T}$ .
3. If $C \in \mathcal{T}$ , then $\tilde{C} \in \mathcal{T}$ .

Let $T_0$ be the intersection of all towers, which is well-defined, since $\mathcal{C}$ itself is a tower. It follows that $T_0$ must also be a tower. If we prove $T_0$ is a chain (under inclusion), then $C = \bigcup T_0 \in T_0$ by property 2, and $\tilde{C} \in T_0$ by property 3. That implies $\tilde{C} \subseteq \bigcup T_0 = C$ , so $C$ is a maximal chain and we’ll be done.

To prove that $T_0$ is a chain, let $\Gamma$ be the set of elements of $T_0$ which are comparable (under inclusion) with all elements of $T_0$ . If we show $\Gamma$ is a tower, then by minimality of $T_0$ we get $\Gamma = T_0$ , and that would imply $T_0$ is a chain. Clearly $\Gamma$ satisfies 1. If $T \subseteq \Gamma$ is a chain, then $\bigcup T \in T_0$ since $T_0$ is a tower, and for any given $y \in T_0$ , either $y \subseteq x \subseteq \bigcup T$ for some $x \in T$ , or $x \subseteq y$ for all $x \in T$ , and then $\bigcup T \subseteq y$ . So $\bigcup T$ is comparable with every element of $T_0$ , and $\Gamma$ satisfies 2.

To prove $\Gamma$ satisfies 3, let $C \in \Gamma$ be fixed and define $\Gamma_C = \{ A \in T_0 \ : \ A \subseteq C \text{ or } \tilde{C} \subseteq A \}$ . Since $C \subseteq \tilde{C}$ , every element of $\Gamma_C$ is comparable with $\tilde{C}$ . Once again, if we prove $\Gamma_C$ is a tower, then $\Gamma_C = T_0$ , so $\tilde{C}$ (which lies in $T_0$ by property 3) is comparable with every element of $T_0$ , meaning $\tilde{C} \in \Gamma$ . This proves $\Gamma$ satisfies 3 and is a tower. Clearly $\Gamma_C$ satisfies 1, and like above, if $T \subseteq \Gamma_C$ is a chain, either $\tilde{C} \subseteq A \subseteq \bigcup T$ for some $A \in T$ , or $A \subseteq C$ for all $A \in T$ , in which case $\bigcup T \subseteq C$ , and so $\Gamma_C$ satisfies 2. Finally, suppose $A \in \Gamma_C$ . If $\tilde{C} \subseteq A$ , then $\tilde{C} \subseteq \tilde{A}$ . If $A \subseteq C$ , since $C \in \Gamma$ and $\tilde{A} \in T_0$ , it must be that either $\tilde{A} \subseteq C$ or $C \subset \tilde{A}$ . In the first case $\tilde{A} \in \Gamma_C$ directly. In the second case $A \subseteq C \subset \tilde{A}$ . But now we use the fact that $\tilde{A} \setminus A$ is either empty or a singleton, which forces $A = C$ , hence $\tilde{A} = \tilde{C}$ and $\tilde{C} \subseteq \tilde{A}$ . So in all cases $\tilde{A} \in \Gamma_C$ , and so $\Gamma_C$ satisfies 3. $\square$
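The construction in (5) ⇒ (1) can be traced on a finite poset. In that setting the union of a nonempty chain of sets is just its largest member, so the smallest tower $T_0$ is the sequence $\emptyset, \tilde{\emptyset}, \tilde{\tilde{\emptyset}}, \dots$ until it stabilises. The sketch below is an illustration under assumptions: the poset (divisibility on a small set), the choice function (`min`) and the function names are not from the text.

```python
from itertools import combinations

# Illustrative poset (an assumption): divisibility on P.
P = [2, 3, 4, 6, 8, 9]

def comparable(a, b):
    """a and b are comparable under divisibility."""
    return a % b == 0 or b % a == 0

def is_chain(C):
    return all(comparable(a, b) for a, b in combinations(C, 2))

def choice(A):
    """A choice function on nonempty subsets of P; min is an arbitrary pick."""
    return min(A)

def star(C):
    """C* = elements outside C that can be added while keeping a chain."""
    return frozenset(x for x in P if x not in C and is_chain(C | {x}))

def tilde(C):
    """C with the chosen element of C* added, or C itself if C* is empty."""
    S = star(C)
    return C | {choice(S)} if S else C

def smallest_tower():
    """In a finite poset, T_0 is obtained by iterating tilde from the empty chain."""
    C = frozenset()
    tower = [C]
    while tilde(C) != C:
        C = tilde(C)
        tower.append(C)
    return tower
```

With these choices the tower is $\emptyset \subset \{2\} \subset \{2, 4\} \subset \{2, 4, 8\}$ . It is a chain under inclusion, and its union $\{2, 4, 8\}$ is fixed by `tilde`, so it is a maximal chain of the poset.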