Theorem 7.2:
A hypothesis class $\mathcal{H}$ of binary classifiers is nonuniformly learnable if and only if it is a countable union of agnostic PAC learnable hypothesis classes.
Proof Proposal:
"$\Leftarrow$" as in the book.
"$\Rightarrow$"
Assume that $\mathcal{H}$ is nonuniformly learnable using some algorithm $A$.
For every $n \in \mathbb{N}$ let
$ \mathcal{H}_{n} := \{ h \in \mathcal{H} : m_{\mathcal{H}}^{NUL} (1/10, 1/10, h) \leq n \} .$
It follows that $\mathcal{H} = \bigcup_{n \in \mathbb{N}} \mathcal{H}_{n}$.
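The decomposition holds because nonuniform learnability makes the sample complexity finite for every single hypothesis; spelled out (each $\mathcal{H}_{n} \subseteq \mathcal{H}$ gives the other inclusion):

```math
h \in \mathcal{H}
  \;\Longrightarrow\;
  m_{\mathcal{H}}^{NUL}(1/10, 1/10, h) < \infty
  \;\Longrightarrow\;
  h \in \mathcal{H}_{n}
  \quad\text{for } n := m_{\mathcal{H}}^{NUL}(1/10, 1/10, h).
```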
We prove the "$\Rightarrow$" claim by contradiction: suppose some $\mathcal{H}_{n}$ is not agnostic PAC (APAC) learnable; we will derive a contradiction to the definition of $\mathcal{H}_{n}$.
Then, by the Fundamental Theorem of Statistical Learning (Thm. 6.7), $\mathcal{H}_{n}$ has infinite VC-dimension.
(Note that this implies that the domain $\mathcal{X}$, on which $\mathcal{H}_{n}$ and $\mathcal{H}$ operate, must be infinite as well.)
Thus, $\mathcal{H}_{n}$ shatters sets of size $n'$ for all $n' \in \mathbb{N}$.
$A$ is a learning algorithm for the class of binary classifiers with the $0$-$1$ loss.
Thus, for any sample size $n'$ the No-Free-Lunch Theorem (Thm. 5.1) applies to $A$; since the hard distribution is supported on a set shattered by $\mathcal{H}_{n}$, the zero-error function it provides can be chosen as some $h' \in \mathcal{H}_{n}$.
(In particular, Thm. 5.1 does not require $A$ to output anything more specific than a function $f: \mathcal{X} \to \{ 0,1 \} $.)
That is:
$ \forall n' \in \mathbb{N} \ \exists $ a distribution $D$ that is realizable by $\mathcal{H}_{n}$ and $\exists h' \in \mathcal{H}_{n}$ s.t. with probability $ \geq 1/7$ over the choice of $ S \sim D^{n'}$ it holds that $L_{D} (A(S)) \geq 1/8 = L_{D} (h') + 1/8$.
However, by definition of $\mathcal{H}_{n}$ we have $\forall h\in \mathcal{H}_{n} : m_{\mathcal{H}}^{NUL} (1/10, 1/10, h) \leq n$.
That is, for every distribution $D$ and every $h \in \mathcal{H}_{n}$: $L_{D} (A(S)) \leq L_{D}(h) + 1/10$ with probability $\geq 9/10$ over the choice of $S \sim D^{n}$.
Considering the case $n' = n$ (with $h = h'$, so that $L_{D}(h') = 0$ by realizability) yields the desired contradiction.
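Spelled out, with $h = h'$ and $L_{D}(h') = 0$ on the realizable distribution $D$ supplied by the No-Free-Lunch Theorem for sample size $n$, the two guarantees read:

```math
\Pr_{S \sim D^{n}}\!\big[ L_{D}(A(S)) \geq 1/8 \big] \;\geq\; 1/7
\qquad \text{(No-Free-Lunch)},
```

```math
\Pr_{S \sim D^{n}}\!\big[ L_{D}(A(S)) \geq 1/8 \big]
  \;\leq\; \Pr_{S \sim D^{n}}\!\big[ L_{D}(A(S)) > 1/10 \big]
  \;\leq\; 1/10
\qquad \text{(NUL guarantee)},
```

and $1/10 < 1/7$, which is impossible.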
Hence, $\mathcal{H}_{n}$ must be agnostic PAC learnable.
Comments
Looks reasonable to me. Just a small suggestion: As you argue anyway with $L_{D} (h') + 1/8$ and $L_{D}(h) + 1/10$ (and not only with 1/8 and 1/10), we could drop the realizability assumption completely and directly argue for all $D$. In my opinion, this would make the proof even a little simpler.