<?xml version="1.0"?> 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "mathml.dtd"> 
<?xml-stylesheet type="text/css" href="thesis.css"?> 
<html  
xmlns="http://www.w3.org/1999/xhtml"  
><head><title>12.2 Bound Application Details</title> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> 
<meta name="generator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<meta name="originator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<!-- 3,early_,early^,xhtml,mozilla --> 
<meta name="src" content="thesis.tex" /> 
<meta name="date" content="2002-08-28 13:56:00" /> 
<link rel="stylesheet" type="text/css" href="thesis.css" /> 
</head><body 
>
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisse55.xml" >next</a>] [<a 
href="thesisse53.xml" >prev</a>] [<a 
href="thesisse53.xml#tailthesisse53.xml" >prev-tail</a>] [<a 
href="#tailthesisse54.xml">tail</a>] [<a 
href="thesisch12.xml#thesisse54.xml" >up</a>] </p></div>
   <h3 class="sectionHead"><span class="titlemark">12.2. </span> <a 
  name="x75-10400012.2"></a>Bound Application Details</h3>
<!--l. 4593--><p class="noindent">Before presenting the results, we note a few important details about how this
bound is applied.
</p>
   <h4 class="subsectionHead"><span class="titlemark">12.2.1. </span> <a 
  name="x75-10500012.2.1"></a>Structural Risk Minimization</h4>
<!--l. 4599--><p class="noindent">A naive application of the shell bound is not useful because the hypothesis
space can be extremely large. Instead, we combine it with Structural Risk Minimization
(SRM) to achieve useful results. In SRM, one starts with a bound for each hypothesis space, <!--l. 4602--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><msub><mrow 
><mi 
>H</mi></mrow><mrow 
><mi 
>i</mi></mrow></msub 
></mrow></math>, in a sequence of nested
hypothesis spaces, <!--l. 4603--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">       <mrow 
><msub><mrow 
><mi 
>H</mi></mrow><mrow 
><mn>1</mn></mrow></msub 
> <mo 
class="MathClass-rel">&#x2286;</mo><mo 
class="MathClass-rel">&#x22EF;</mo><mo 
class="MathClass-rel">&#x2286;</mo> <msub><mrow 
><mi 
>H</mi></mrow><mrow 
><mi 
>n</mi></mrow></msub 
></mrow></math>.
These bounds on individual hypothesis spaces are combined into a single bound
that holds simultaneously for all hypothesis spaces in the sequence. The particular nesting we use is &#x201C;<!--l. 4605--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><msub><mrow 
><mi 
>H</mi></mrow><mrow 
><mi 
>i</mi></mrow></msub 
> <mo 
class="MathClass-rel">=</mo></mrow></math> all decision
trees with <!--l. 4606--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><mi 
>i</mi></mrow></math>
or fewer internal nodes&#x201D;.
</p><!--l. 4608--><p class="indent">   Since the size of the decision tree hypothesis spaces increases exponentially with the index <!--l. 4609--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>i</mi></mrow></math>, we choose <!--l. 4609--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><msub><mrow 
><mi 
>p</mi></mrow><mrow 
><mi 
>i</mi></mrow></msub 
> <mo 
class="MathClass-rel">=</mo>  <mfrac><mrow 
><mn>1</mn></mrow> 
<mrow 
><msup><mrow 
><mn>2</mn></mrow><mrow 
><mi 
>i</mi></mrow></msup 
></mrow></mfrac></mrow></math>. This choice has the property
that <!--l. 4610--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">       <mrow 
> <mfrac><mrow 
><mn>1</mn></mrow>
<mrow 
><msub><mrow 
><mi 
>p</mi></mrow><mrow 
><mi 
>i</mi></mrow></msub 
></mrow></mfrac></mrow></math> is always small
in comparison to <!--l. 4611--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">       <mrow 
><mo 
class="MathClass-rel">&#x2223;</mo><msub><mrow 
><mi 
>H</mi></mrow><mrow 
><mi 
>i</mi></mrow></msub 
><mo 
class="MathClass-rel">&#x2223;</mo></mrow></math>,
implying that the SRM bound is never much worse than applying the underlying bound
directly to a single hypothesis space.
</p>
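<p class="indent">   The effect of this weighting can be sketched numerically. The Python fragment below is only an illustration under stated assumptions, not the computation behind the results: as a stand-in for the shell bound it uses the standard Hoeffding-plus-union bound for a finite hypothesis space, and it counts trees by assuming binary features and binary leaf labels. The function names, the tree count and the parameter values are all hypothetical choices made here. It compares the penalty ln(1/p_i) = i ln 2 with the log of the tree count, and evaluates the stand-in bound with and without the confidence being reduced to delta times p_i.
</p>
<pre class="verbatim">
```python
import math


def srm_weight(i):
    """Weight p_i = 1 / 2**i given to hypothesis space H_i, with i starting at 1."""
    return 2.0 ** (-i)


def log_tree_count(i, num_features):
    """Natural log of an illustrative count of trees with exactly i internal nodes.

    Assumption (not from the text): Catalan(i) binary tree shapes, one of
    num_features binary tests at each internal node, and a binary label on
    each of the i + 1 leaves. It counts tree descriptions of exactly this
    size, so it is only a rough proxy for the size of H_i.
    """
    catalan = math.comb(2 * i, i) // (i + 1)
    return math.log(catalan) + i * math.log(num_features) + (i + 1) * math.log(2)


def union_bound_gap(log_size, m, delta):
    """Stand-in bound: sqrt((ln|H| + ln(1/delta)) / (2 m)), Hoeffding plus a union bound."""
    return math.sqrt((log_size + math.log(1.0 / delta)) / (2.0 * m))


def srm_gap(i, log_size, m, delta):
    """Same stand-in bound for H_i, run at the reduced confidence delta * p_i."""
    return union_bound_gap(log_size, m, delta * srm_weight(i))


if __name__ == "__main__":
    m, delta, num_features = 10000, 0.05, 20  # hypothetical values
    for i in (1, 5, 20, 50):
        log_size = log_tree_count(i, num_features)
        print(
            i,
            round(i * math.log(2), 2),  # SRM penalty ln(1/p_i)
            round(log_size, 2),         # ln of the illustrative tree count
            round(union_bound_gap(log_size, m, delta), 4),
            round(srm_gap(i, log_size, m, delta), 4),
        )
```
</pre>
<p class="indent">   Under these assumptions the SRM penalty i ln 2 is smaller than the log of the tree count for every index, so the last printed column is only modestly larger than the one before it.
</p>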
   <h4 class="subsectionHead"><span class="titlemark">12.2.2. </span> <a 
  name="x75-10600012.2.2"></a>Computation</h4>
<!--l. 4617--><p class="noindent">The computational cost of calculating some of these bounds is nontrivial. There are two
basic parts to this computation:
</p><!--l. 4620--><p class="indent">
           </p><ol type="1" class="enumerate1" start="1" 
>
        <li class="enumerate"><a 
  name="x75-106002x1"></a>Gathering the information needed to calculate the bound. This is
        sometimes infeasible for shell bounds and for the PAC-Bayes bounds discussed later.
           </li>
        <li class="enumerate"><a 
  name="x75-106004x2"></a>Combining the information to calculate the bound. For the shell
        bound, the amount of computation is approximately <!--l. 4624--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>O</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><msup><mrow 
><mi 
>m</mi></mrow><mrow 
><mn>1</mn><mo 
class="MathClass-punc">.</mo><mn>5</mn></mrow></msup 
></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
        where <!--l. 4624--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>m</mi></mrow></math>
        is the number of training examples. For the combined test and shell bound,
        the computation is approximately <!--l. 4626--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>O</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><msup><mrow 
><mi 
>m</mi></mrow><mrow 
><mn>2</mn></mrow></msup 
></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>.</li></ol>
<!--l. 4627--><p class="nopar"> We typically avoid the difficulties inherent in (1) by using techniques such as Monte Carlo
sampling, followed by bounding the deviation of the Monte Carlo estimate. In particular, we
use the anytime computation trick of section  <a 
href="thesisse53.xml#x74-10300012.1.3">12.1.3<!--tex4ht:ref: sec-fast-sampling --></a> here.
</p><!--l. 4633--><p class="indent">   To avoid the difficulties inherent in (2), we use fast bounds (see section  <a 
href="thesisse10.xml#x16-240003.2">3.2<!--tex4ht:ref: sec-approximation --></a>) on
the Binomial tail as necessary.
</p>
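<p class="indent">   Both workarounds can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used here: the Monte Carlo part estimates an abstract fraction in [0, 1] and bounds its deviation with Hoeffding's inequality, and the fast Binomial tail bound shown is the relative entropy Chernoff bound, which may or may not be the bound referred to above. All function names, the sampled quantity and the numbers are hypothetical.
</p>
<pre class="verbatim">
```python
import math
import random


def monte_carlo_fraction(indicator, draw, n, rng):
    """Fraction of n independently drawn items for which indicator(item) is true."""
    hits = sum(1 for _ in range(n) if indicator(draw(rng)))
    return hits / n


def hoeffding_upper(sample_mean, n, delta):
    """Upper limit on the true mean that holds with probability at least 1 - delta."""
    return min(1.0, sample_mean + math.sqrt(math.log(1.0 / delta) / (2.0 * n)))


def binomial_lower_tail(m, k, p):
    """Exact Pr[Bin(m, p) is at most k], summed in log space; needs k + 1 terms."""
    total = 0.0
    for j in range(k + 1):
        log_term = (
            math.lgamma(m + 1) - math.lgamma(j + 1) - math.lgamma(m - j + 1)
            + j * math.log(p) + (m - j) * math.log(1.0 - p)
        )
        total += math.exp(log_term)
    return total


def kl_bernoulli(q, p):
    """Relative entropy KL(q || p) of two coin biases, for p strictly between 0 and 1."""
    total = 0.0
    if q > 0.0:
        total += q * math.log(q / p)
    if 1.0 - q > 0.0:
        total += (1.0 - q) * math.log((1.0 - q) / (1.0 - p))
    return total


def chernoff_lower_tail(m, k, p):
    """Constant-time upper bound exp(-m KL(k/m || p)) on the same lower tail."""
    q = k / m
    if q >= p:
        return 1.0  # the bound is only informative when k/m lies below p
    return math.exp(-m * kl_bernoulli(q, p))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical quantity: the fraction of uniform draws that exceed 0.7.
    estimate = monte_carlo_fraction(lambda x: x > 0.7, lambda r: r.random(), 2000, rng)
    print("estimate:", estimate, "upper limit:", hoeffding_upper(estimate, 2000, 0.01))
    m, k, p = 200, 20, 0.2
    print("exact tail:", binomial_lower_tail(m, k, p))
    print("fast bound:", chernoff_lower_tail(m, k, p))
```
</pre>
<p class="indent">   The exact tail requires a sum of k + 1 terms, while the fast bound is a single closed-form expression, at the price of some looseness.
</p>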
<a 
  name="tailthesisse54.xml"></a>   
</body> 
</html> 
