<?xml version="1.0"?> 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "mathml.dtd"> 
<?xml-stylesheet type="text/css" href="thesis.css"?> 
<html  
xmlns="http://www.w3.org/1999/xhtml"  
><head><title>7.4 Methods for tightening</title> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> 
<meta name="generator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<meta name="originator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<!-- 3,early_,early^,xhtml,mozilla --> 
<meta name="src" content="thesis.tex" /> 
<meta name="date" content="2002-08-28 13:56:00" /> 
<link rel="stylesheet" type="text/css" href="thesis.css" /> 
</head><body 
>
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisse33.xml" >next</a>] [<a 
href="thesisse31.xml" >prev</a>] [<a 
href="thesisse31.xml#tailthesisse31.xml" >prev-tail</a>] [<a 
href="#tailthesisse32.xml">tail</a>] [<a 
href="thesisch7.xml#thesisse32.xml" >up</a>] </p></div>
   <h3 class="sectionHead"><span class="titlemark">7.4. </span> <a 
  name="x47-700007.4"></a>Methods for tightening</h3>
<!--l. 2918--><p class="noindent">The previous section presented a bound in asymptotic form, which is useful
for understanding the trade-offs between the number of examples (<!--l. 2919--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>m</mi></mrow></math>), the size of the hypothesis
space (<!--l. 2920--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">       <mrow 
><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>H</mi><mo 
class="MathClass-rel">&#x2223;</mo></mrow></math>), the margin (<!--l. 2920--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>&#x03B8;</mi></mrow></math>) and the entropy of
the average (<!--l. 2921--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><mi 
>H</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>Q</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>).
However, this form is poorly suited to quantitative application of the
bound to specific problems. Here we state improvements that aid in the development of a
quantitatively applicable bound. The bound above can be tightened through several
techniques:
</p><ol type="1" class="enumerate1" start="1" 
>
        <li class="enumerate"><a 
  name="x47-70002x1"></a>Parameterizing arbitrary choices within the proof and then optimizing
        over the parameterization. In the (improved) margin bound proof, we arbitrarily
        chose to work with the margin of the randomly produced function <!--l. 2929--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
        <mrow 
><mi 
>g</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
        at <!--l. 2929--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mfrac><mrow 
><mi 
>&#x03B8;</mi></mrow>
<mrow 
><mn>2</mn></mrow></mfrac></mrow></math>.
        This is a good heuristic, but it is not the optimal choice when we use the
        improved tail bounds. Since the choice of margin for the random
        function <!--l. 2931--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>g</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
        is a parameter of the proof, we are free to optimize it.
           </li>
        <li class="enumerate"><a 
  name="x47-70004x2"></a>A tighter argument within the proof. The optimal value of <!--l. 2933--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>N</mi></mrow></math>
        is a function of <!--l. 2934--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>&#x03B8;</mi><mo 
class="MathClass-punc">,</mo><mi 
>m</mi><mo 
class="MathClass-punc">,</mo><!--mstyle 
class="text"--><mtext class="textrm">KL</mtext><!--/mstyle--><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>q</mi><mo 
class="MathClass-rel">&#x2223;</mo><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>p</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
        and <!--l. 2934--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>&#x03B4;</mi></mrow></math>.
        All of these are known in advance except for <!--l. 2935--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><!--mstyle 
class="text"--><mtext class="textrm">KL</mtext><!--/mstyle--><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>q</mi><mo 
class="MathClass-rel">&#x2223;</mo><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>p</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>.
        If we can estimate in advance the value of <!--l. 2936--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><!--mstyle 
class="text"--><mtext class="textrm">KL</mtext><!--/mstyle--><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>q</mi><mo 
class="MathClass-rel">&#x2223;</mo><mo 
class="MathClass-rel">&#x2223;</mo><mi 
>p</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>,
        then it becomes possible to optimize the value of <!--l. 2937--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>N</mi></mrow></math>
        in a data-independent manner. Consequently, it becomes unnecessary to
        split our confidence over the possible values of <!--l. 2938--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>N</mi></mrow></math>
        and we need only split the confidence over the values of <!--l. 2939--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>&#x03B8;</mi></mrow></math>
        in proving the bound. This improvement reduces <!--l. 2940--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mn>1</mn><mo 
class="MathClass-bin">/</mo><msub><mrow 
><mi 
>&#x03B4;</mi></mrow><mrow 
><mi 
>N</mi><mo 
class="MathClass-punc">,</mo><mi 
>k</mi></mrow></msub 
> <mo 
class="MathClass-rel">=</mo> <mn>1</mn><mo 
class="MathClass-bin">/</mo><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>N</mi><msup><mrow 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>N</mi> <mo 
class="MathClass-bin">+</mo> <mn>1</mn></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mrow 
><mn>2</mn></mrow></msup 
></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
        to <!--l. 2941--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mn>1</mn><mo 
class="MathClass-bin">/</mo><msub><mrow 
><mi 
>&#x03B4;</mi></mrow><mrow 
><mi 
>N</mi><mo 
class="MathClass-punc">,</mo><mi 
>k</mi></mrow></msub 
> <mo 
class="MathClass-rel">=</mo> <mn>1</mn><mo 
class="MathClass-bin">/</mo><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>N</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>N</mi> <mo 
class="MathClass-bin">+</mo> <mn>1</mn></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
        , which gives a small improvement in the low-order terms of the improved
        averaging bound.</li></ol>
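<p class="indent">The size of the saving in item 2 can be illustrated with a short numerical sketch. It reads the two expressions in item 2 as the factors N(N + 1)<sup>2</sup> and N(N + 1) by which the confidence parameter is divided; this reading is an assumption, as are the function names <tt>split_penalty</tt> and <tt>penalty_saving</tt>, which are hypothetical and used only for illustration. Under that assumption, each factor adds its logarithm to the ln(1/&#x03B4;) term of the bound.</p>

```python
import math


def split_penalty(n, exponent):
    """Return ln(N * (N + 1)**exponent).

    Assumption: a union bound with weights delta / (N * (N + 1)**exponent)
    adds this amount to the ln(1 / delta) term of the bound.
    """
    return math.log(n) + exponent * math.log(n + 1)


def penalty_saving(n):
    """Difference between the two splits; algebraically this is ln(N + 1)."""
    return split_penalty(n, 2) - split_penalty(n, 1)


if __name__ == "__main__":
    # Columns: N, penalty before, penalty after, saving.
    for n in (10, 100, 1000):
        print(n,
              round(split_penalty(n, 2), 3),
              round(split_penalty(n, 1), 3),
              round(penalty_saving(n), 3))
```

<p class="indent">Because the factor enters only through a logarithm, the saving under this reading is ln(N + 1), which is consistent with the improvement being confined to the low-order terms.</p>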
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisse33.xml" >next</a>] [<a 
href="thesisse31.xml" >prev</a>] [<a 
href="thesisse31.xml#tailthesisse31.xml" >prev-tail</a>] [<a 
href="thesisse32.xml" >front</a>] [<a 
href="thesisch7.xml#thesisse32.xml" >up</a>] </p></div><a 
  name="tailthesisse32.xml"></a>  
</body> 
</html> 
