<?xml version="1.0"?> 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "mathml.dtd"> 
<?xml-stylesheet type="text/css" href="thesis.css"?> 
<html  
xmlns="http://www.w3.org/1999/xhtml"  
><head><title>4.2 The basic training set bound</title> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> 
<meta name="generator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<meta name="originator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<!-- 3,early_,early^,xhtml,mozilla --> 
<meta name="src" content="thesis.tex" /> 
<meta name="date" content="2002-08-28 13:56:00" /> 
<link rel="stylesheet" type="text/css" href="thesis.css" /> 
</head><body 
>
   <h3 class="sectionHead"><span class="titlemark">4.2. </span> <a 
  name="x23-320004.2"></a>The basic training set bound</h3>
<!--l. 1034--><p class="noindent">The most basic training-set-based sample complexity bound is a simple
combination of a Binomial tail bound and the union bound. In particular, we
have:
</p>
   <div class="newtheorem">
<!--l. 1038--><p class="noindent"><span class="head">
<a 
  name="x23-32001r1"></a>
  <span 
class="eccc-1000">T<small 
class="small-caps">H</small><small 
class="small-caps">E</small><small 
class="small-caps">O</small><small 
class="small-caps">R</small><small 
class="small-caps">E</small><small 
class="small-caps">M</small> </span>4.2.1<span 
class="eccc-1000">.</span></span>
</p><!--l. 1039--><p class="indent">   <span 
class="ecti-1000">(Discrete Hypothesis Bound) For all hypothesis spaces, </span><!--l. 1039--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>H</mi></mrow></math><span 
class="ecti-1000">,</span>
<span 
class="ecti-1000">for all </span><!--l. 1040--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>&#x03B4;</mi> <mo 
class="MathClass-rel">&#x2208;</mo> <mrow><mo 
class="MathClass-open">(</mo><mrow><mn>0</mn><mo 
class="MathClass-punc">,</mo><mn>1</mn></mrow><mo 
class="MathClass-close">]</mo></mrow></mrow></math>
<!--l. 1041--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">    <mrow 
>
                          <msub><mrow 
><mo 
>Pr</mo></mrow><mrow 
><msup><mrow 
><mi 
>D</mi></mrow><mrow 
><mi 
>m</mi></mrow></msup 
></mrow></msub 
> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>&#x2203;</mi><mi 
>h</mi> <mo 
class="MathClass-rel">&#x2208;</mo> <mi 
>H</mi> <mo 
class="MathClass-punc">:</mo>  <mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2265;</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo>&#x0304;</mo></mover> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>m</mi><mo 
class="MathClass-punc">,</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo>&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-punc">,</mo>  <mfrac><mrow 
><mi 
>&#x03B4;</mi></mrow> 
<mrow 
><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>H</mi><mo 
class="MathClass-rel">&#x2223;</mo></mrow></mfrac></mrow></mfenced></mrow></mfenced><mo 
class="MathClass-rel">&#x2264;</mo> <mi 
>&#x03B4;</mi>
</mrow></math>
<span 
class="ecti-1000">where </span><!--l. 1043--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0304;</mo></mover> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>m</mi><mo 
class="MathClass-punc">,</mo> <mfrac><mrow 
><mi 
>k</mi></mrow> 
<mrow 
><mi 
>m</mi></mrow></mfrac><mo 
class="MathClass-punc">,</mo><mi 
>&#x03B4;</mi></mrow></mfenced> <mo 
class="MathClass-rel">&#x2261;</mo><msub><mrow 
><mo 
> max</mo></mrow><mrow 
><mi 
>p</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">{</mo><mrow><mi 
>p</mi> <mo 
class="MathClass-punc">:</mo>  <!--mstyle 
class="text"--><mtext class="textrm">Bin</mtext><!--/mstyle--><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>m</mi><mo 
class="MathClass-punc">,</mo><mi 
>k</mi><mo 
class="MathClass-punc">,</mo><mi 
>p</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">=</mo> <mi 
>&#x03B4;</mi></mrow><mo 
class="MathClass-close">}</mo></mrow></mrow></math>
</p>
   </div>
<!--l. 1045--><p class="indent">   Note that this theorem can only be nonvacuous when the hypothesis space, <!--l. 1045--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>H</mi></mrow></math>, has
some finite (discrete) size.
</p>
   <div class="proof">
<!--l. 1049--><p class="indent">   <span class="head">
                                                                     

                                                                     
   <span 
class="eccc-1000">P<small 
class="small-caps">R</small><small 
class="small-caps">O</small><small 
class="small-caps">O</small><small 
class="small-caps">F</small>.</span> </span>For every individual hypothesis, we know that: <!--l. 1050--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">
<mrow 
>
                  <mi 
>&#x2200;</mi><mi 
>h</mi> <msub><mrow 
><mo 
>Pr</mo></mrow><mrow 
><msup><mrow 
><mi 
>D</mi></mrow><mrow 
><mi 
>m</mi></mrow></msup 
></mrow></msub 
> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2265;</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo>&#x0304;</mo></mover> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>m</mi><mo 
class="MathClass-punc">,</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo>&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-punc">,</mo>  <mfrac><mrow 
><mi 
>&#x03B4;</mi></mrow> 
<mrow 
><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>H</mi><mo 
class="MathClass-rel">&#x2223;</mo></mrow></mfrac></mrow></mfenced></mrow></mfenced><mo 
class="MathClass-rel">&#x2264;</mo> <mfrac><mrow 
><mi 
>&#x03B4;</mi></mrow> 
<mrow 
><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>H</mi><mo 
class="MathClass-rel">&#x2223;</mo></mrow></mfrac>
</mrow></math>
Applying the union bound (see section <a 
href="thesisse13.xml#x19-270003.5">3.5<!--tex4ht:ref: sec-union --></a>) for every hypothesis gives us: <!--l. 1054--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">
<mrow 
>
                <msub><mrow 
><mo 
>Pr</mo></mrow><mrow 
><msup><mrow 
><mi 
>D</mi></mrow><mrow 
><mi 
>m</mi></mrow></msup 
></mrow></msub 
> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>&#x2203;</mi><mi 
>h</mi> <mo 
class="MathClass-rel">&#x2208;</mo> <mi 
>H</mi> <mo 
class="MathClass-punc">:</mo>  <mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2265;</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo>&#x0304;</mo></mover> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>m</mi><mo 
class="MathClass-punc">,</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo>&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-punc">,</mo>  <mfrac><mrow 
><mi 
>&#x03B4;</mi></mrow> 
<mrow 
><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>H</mi><mo 
class="MathClass-rel">&#x2223;</mo></mrow></mfrac></mrow></mfenced></mrow></mfenced><mo 
class="MathClass-rel">&#x2264;</mo> <mi 
>&#x03B4;</mi>
</mrow></math>
which is the result. <span class="qed"><span 
class="msam-10">&#x25AB;</span></span>
</p>
   </div>
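<p class="indent">   As a minimal numerical sketch of theorem 4.2.1, the bound can be evaluated by bisection on p. The sketch assumes that Bin(m, k, p) denotes the probability of observing at most k errors in m independent trials with error rate p. The function names are illustrative and do not come from the text, and the direct summation of the tail is only practical for small m.
</p>
<pre>
```python
from math import comb


def binomial_tail(m, k, p):
    """Probability of at most k errors in m trials with error rate p (assumed Bin)."""
    return sum(comb(m, j) * p**j * (1.0 - p) ** (m - j) for j in range(k + 1))


def error_upper_bound(m, k, delta, tol=1e-12):
    """Invert the tail: the p at which binomial_tail(m, k, p) drops to delta."""
    if k >= m:
        return 1.0  # the tail equals 1 for every p, so the bound is vacuous
    lo, hi = 0.0, 1.0
    # The tail decreases continuously from 1 to 0 in p, so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if binomial_tail(m, k, mid) >= delta:
            lo = mid  # tail still at least delta: the root is at or above mid
        else:
            hi = mid
    return hi  # upper end of the bracket, so the returned bound is conservative


def discrete_hypothesis_bound(m, k, delta, num_hypotheses):
    """Bound of theorem 4.2.1: each hypothesis receives confidence delta / |H|."""
    return error_upper_bound(m, k, delta / num_hypotheses)
```
</pre>
<p class="indent">   For instance, discrete_hypothesis_bound(100, 5, 0.05, 1000) evaluates the bound for a hypothesis with 5 training errors on 100 examples, chosen from a space of 1000 hypotheses. Increasing num_hypotheses can only increase the returned value.
</p>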
<!--l. 1058--><p class="indent">   Intuitively, this theorem says that as the number of hypotheses grows, our
guarantee that the empirical error is near the true error becomes weaker.
</p><!--l. 1061--><p class="indent">   We can gain a better understanding by considering some approximations of the
Binomial tail bound. These approximations also make it easier to compare this theorem
with theorems stated in more common forms.
</p>
   <div class="newtheorem">
<!--l. 1065--><p class="noindent"><span class="head">
                                                                     

                                                                     
<a 
  name="x23-32002r2"></a>
  <span 
class="eccc-1000">C<small 
class="small-caps">O</small><small 
class="small-caps">R</small><small 
class="small-caps">O</small><small 
class="small-caps">L</small><small 
class="small-caps">L</small><small 
class="small-caps">A</small><small 
class="small-caps">R</small><small 
class="small-caps">Y</small> </span>4.2.2<span 
class="eccc-1000">.</span></span>
</p><!--l. 1066--><p class="indent">   <span 
class="ecti-1000">(Relative Entropy Discrete Hypothesis Bound) For all hypothesis spaces, </span><!--l. 1067--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>H</mi></mrow></math><span 
class="ecti-1000">,</span>
<span 
class="ecti-1000">for all </span><!--l. 1067--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>&#x03B4;</mi> <mo 
class="MathClass-rel">&#x2208;</mo> <mrow><mo 
class="MathClass-open">(</mo><mrow><mn>0</mn><mo 
class="MathClass-punc">,</mo><mn>1</mn></mrow><mo 
class="MathClass-close">]</mo></mrow></mrow></math><span 
class="ecti-1000">:</span>
<!--l. 1068--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">    <mrow 
>
                  <msub><mrow 
><mo 
>Pr</mo></mrow><mrow 
><msup><mrow 
><mi 
>D</mi></mrow><mrow 
><mi 
>m</mi></mrow></msup 
></mrow></msub 
> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>&#x2203;</mi><mi 
>h</mi> <mo 
class="MathClass-rel">&#x2208;</mo> <mi 
>H</mi> <mo 
class="MathClass-punc">:</mo>  <!--mstyle 
class="text"--><mtext class="textrm">KL</mtext><!--/mstyle--><mrow><mo 
class="MathClass-open">(</mo><mrow><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo>&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-rel">&#x2223;</mo><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2265;</mo> <mfrac><mrow 
><mo 
>ln</mo><!--nolimits--><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>H</mi><mo 
class="MathClass-rel">&#x2223;</mo> <mo 
class="MathClass-bin">+</mo><mo 
> ln</mo><!--nolimits--> <mfrac><mrow 
><mn>1</mn></mrow> 
<mrow 
><mi 
>&#x03B4;</mi></mrow></mfrac></mrow> 
       <mrow 
><mi 
>m</mi></mrow></mfrac>      </mrow></mfenced> <mo 
class="MathClass-rel">&#x2264;</mo> <mi 
>&#x03B4;</mi>
</mrow></math>
</p>
   </div>
   <div class="proof">
<!--l. 1073--><p class="indent">   <span class="head">
   <span 
class="eccc-1000">P<small 
class="small-caps">R</small><small 
class="small-caps">O</small><small 
class="small-caps">O</small><small 
class="small-caps">F</small>.</span> </span>Loosen theorem <a 
href="#x23-32001r1">4.2.1<!--tex4ht:ref: th-DHSCP --></a> with the relative entropy Chernoff bound
(Eqn. <a 
href="thesisse10.xml#x16-24001r1">3.2.1<!--tex4ht:ref: eq-recb --></a>), and use the inversion lemma (lemma <a 
href="thesisse12.xml#x18-26001r1">3.4.1<!--tex4ht:ref: lem-inversion --></a>). <span class="qed"><span 
class="msam-10">&#x25AB;</span></span>
</p>
   </div>
<!--l. 1076--><p class="indent">   The form of this corollary allows us to make two more important observations:
</p><!--l. 1078--><p class="indent">
           </p><ol type="1" class="enumerate1" start="1" 
>
        <li class="enumerate"><a 
  name="x23-32004x1"></a>The &#x201C;cost&#x201D; of doubling the hypothesis space size is about one extra example.
        In other words, we can keep a constant bound on the probability of a large
        deviation with <!--l. 1081--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>c</mi><mo 
>log</mo><!--nolimits--><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>H</mi><mo 
class="MathClass-rel">&#x2223;</mo></mrow></math>
        examples (for any constant <!--l. 1081--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>c</mi></mrow></math>).
           </li>
        <li class="enumerate"><a 
  name="x23-32006x2"></a>The value of <!--l. 1082--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>&#x03B4;</mi></mrow></math>
        is not very important as <!--l. 1082--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>m</mi></mrow></math>
        grows larger.</li></ol>
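<p class="indent">   Both observations can be checked numerically. The following sketch, whose function names are illustrative and do not come from the text, computes the threshold on the right-hand side of corollary 4.2.2, inverts the KL-divergence by bisection to obtain an upper bound on the true error, and computes the sample size at which the threshold equals a chosen value eps. Doubling |H| adds ln 2 to the numerator of the threshold, so holding the threshold at eps costs ln(2)/eps additional examples, independent of |H|.
</p>
<pre>
```python
from math import log


def kl_bernoulli(q, p):
    """KL(q||p) between Bernoulli distributions with means q and p."""
    total = 0.0
    if q > 0.0:
        total += q * log(q / p)
    if 1.0 - q > 0.0:
        total += (1.0 - q) * log((1.0 - q) / (1.0 - p))
    return total


def kl_threshold(m, num_hypotheses, delta):
    """Right-hand side of corollary 4.2.2: (ln|H| + ln(1/delta)) / m."""
    return (log(num_hypotheses) + log(1.0 / delta)) / m


def kl_error_upper_bound(m, emp_error, num_hypotheses, delta, tol=1e-12):
    """Largest p above emp_error whose KL from emp_error stays within the threshold."""
    eps = kl_threshold(m, num_hypotheses, delta)
    lo, hi = emp_error, 1.0
    # KL(emp_error||p) increases in p for p above emp_error, so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if eps >= kl_bernoulli(emp_error, mid):
            lo = mid
        else:
            hi = mid
    return hi


def examples_needed(eps, num_hypotheses, delta):
    """Sample size m (real-valued) at which the threshold equals eps."""
    return (log(num_hypotheses) + log(1.0 / delta)) / eps
```
</pre>
<p class="indent">   The difference examples_needed(eps, 2 * n, delta) - examples_needed(eps, n, delta) equals ln(2)/eps for every n, which is the first observation. For the second, the term ln(1/delta) is divided by m in kl_threshold, so its influence on the bound shrinks as m grows.
</p>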
                                                                     

                                                                     
<!--l. 1083--><p class="nopar"> To understand this corollary, it is helpful to consider some approximations of <!--l. 1084--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><!--mstyle 
class="text"--><mtext class="textrm">KL</mtext><!--/mstyle--><mrow><mo 
class="MathClass-open">(</mo><mrow><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-rel">&#x2223;</mo><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>. In particular,
we have: <!--l. 1086--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">        <mrow 
>
                      <!--mstyle 
class="text"--><mtext class="textrm">KL</mtext><!--/mstyle--><mrow><mo 
class="MathClass-open">(</mo><mrow><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-rel">&#x2223;</mo><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2265;</mo> <mn>2</mn><msup><mrow 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-bin">&#x2212;</mo> <mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mrow 
><mn>2</mn></mrow></msup 
>
</mrow></math>
which implies that the KL-divergence is at least twice the squared difference between the
empirical and true error rates. It is not bounded above by the absolute difference in
general: with the empirical error rate fixed strictly between 0 and 1, the KL-divergence
grows without bound as the true error rate approaches 0 or 1.
</p><!--l. 1091--><p class="indent">   We can further loosen the last corollary with the Hoeffding approximation to get the
following commonly stated bound:
</p>
   <div class="newtheorem">
<!--l. 1094--><p class="noindent"><span class="head">
<a 
  name="x23-32007r3"></a>
  <span 
class="eccc-1000">C<small 
class="small-caps">O</small><small 
class="small-caps">R</small><small 
class="small-caps">O</small><small 
class="small-caps">L</small><small 
class="small-caps">L</small><small 
class="small-caps">A</small><small 
class="small-caps">R</small><small 
class="small-caps">Y</small> </span>4.2.3<span 
class="eccc-1000">.</span></span>
</p><!--l. 1095--><p class="indent">   <span 
class="ecti-1000">(Agnostic Discrete Hypothesis Bound) For all hypothesis spaces, </span><!--l. 1096--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>H</mi></mrow></math><span 
class="ecti-1000">,</span>
<span 
class="ecti-1000">for all </span><!--l. 1096--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>&#x03B4;</mi> <mo 
class="MathClass-rel">&#x003E;</mo> <mn>0</mn></mrow></math><span 
class="ecti-1000">,</span>
                                                                     

                                                                     
<!--l. 1097--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">    <mrow 
>
                   <msub><mrow 
><mo 
>Pr</mo></mrow><mrow 
><msup><mrow 
><mi 
>D</mi></mrow><mrow 
><mi 
>m</mi></mrow></msup 
></mrow></msub 
> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>&#x2203;</mi><mi 
>h</mi> <mo 
class="MathClass-rel">&#x2208;</mo> <mi 
>H</mi> <mo 
class="MathClass-punc">:</mo> <mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2265;</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo>&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-bin">+</mo> <msqrt><mi 
></mi>
 <mrow><mfrac><mrow 
><mo 
> ln</mo> <!--nolimits--> <mo 
class="MathClass-rel">&#x2223;</mo><mi 
>H</mi><mo 
class="MathClass-rel">&#x2223;</mo> <mo 
class="MathClass-bin">+</mo><mo 
> ln</mo> <!--nolimits--> <mfrac> <mrow 
> <mn>1</mn></mrow> 
<mrow 
><mi 
>&#x03B4;</mi></mrow></mfrac></mrow>
      <mrow 
><mn>2</mn><mi 
>m</mi></mrow></mfrac></mrow></msqrt>      </mrow></mfenced> <mo 
class="MathClass-rel">&#x2264;</mo> <mi 
>&#x03B4;</mi>
</mrow></math>
</p>
   </div>
   <div class="proof">
<!--l. 1102--><p class="indent">   <span class="head">
   <span 
class="eccc-1000">P<small 
class="small-caps">R</small><small 
class="small-caps">O</small><small 
class="small-caps">O</small><small 
class="small-caps">F</small>.</span> </span>Loosen corollary <a 
href="#x23-32002r2">4.2.2<!--tex4ht:ref: th-REDHSCP --></a> with the bound <!--l. 1102--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><!--mstyle 
class="text"--><mtext class="textrm">KL</mtext><!--/mstyle--><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>q</mi><mo 
class="MathClass-rel">&#x2223;</mo><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>p</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2265;</mo> <mn>2</mn><msup><mrow 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>q</mi> <mo 
class="MathClass-bin">&#x2212;</mo> <mi 
>p</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mrow 
><mn>2</mn></mrow></msup 
></mrow></math>.
<span class="qed"><span 
class="msam-10">&#x25AB;</span></span>
</p>
   </div>
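<p class="indent">   Corollary 4.2.3 is simple enough to evaluate in closed form. The following is a minimal sketch with hypothetical function names. The cap at 1 in agnostic_error_bound is an addition made here, on the grounds that an error rate can never exceed 1, and the sample size formula is obtained by solving the deviation term for m.
</p>
<pre>
```python
from math import ceil, log, sqrt


def agnostic_deviation(m, num_hypotheses, delta):
    """Deviation term of corollary 4.2.3: sqrt((ln|H| + ln(1/delta)) / (2m))."""
    return sqrt((log(num_hypotheses) + log(1.0 / delta)) / (2.0 * m))


def agnostic_error_bound(m, emp_error, num_hypotheses, delta):
    """Empirical error plus the deviation term, capped at 1 (the cap is an assumption)."""
    return min(1.0, emp_error + agnostic_deviation(m, num_hypotheses, delta))


def agnostic_sample_size(gap, num_hypotheses, delta):
    """Smallest integer m for which the deviation term is at most gap."""
    return ceil((log(num_hypotheses) + log(1.0 / delta)) / (2.0 * gap**2))
```
</pre>
<p class="indent">   Because the deviation term depends on |H| only through ln|H|, squaring the size of the hypothesis space no more than doubles the sample size returned by agnostic_sample_size for a fixed gap and delta.
</p>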
<!--l. 1104--><p class="indent">   The agnostic form of this bound has the advantage of showing explicitly that we
are forcing the empirical error rate to converge (with high probability) to the true
error rate. Graphically, we are forcing every empirical error to be near its true error.
The following picture represents this connection:
</p><!--l. 1110--><p class="noindent"><img 
src="thesis3x.gif" alt="PIC" class="graphics" width="85.31874pt" height="12.045pt"  /><!--tex4ht:graphics  
name="thesis3x.gif" src="convergence.eps"  
-->
</p><!--l. 1113--><p class="indent">   Later bounds will have more complicated convergence conditions and correspondingly
more complicated graphs. These bounds are necessary in order to avoid the
limitations of the holdout bound. See figure  <a 
href="thesisse55.xml#x76-1110013">12.3.3<!--tex4ht:ref: fig-simple-holdout --></a> for a comparison with the holdout
set.
</p>
   <a 
  name="tailthesisse16.xml"></a>  
</body> 
</html> 
