<?xml version="1.0"?> 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "mathml.dtd"> 
<?xml-stylesheet type="text/css" href="thesis.css"?> 
<html  
xmlns="http://www.w3.org/1999/xhtml"  
><head><title>11.1 Combination Possibilities</title> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> 
<meta name="generator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<meta name="originator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<!-- 3,early_,early^,xhtml,mozilla --> 
<meta name="src" content="thesis.tex" /> 
<meta name="date" content="2002-08-28 13:56:00" /> 
<link rel="stylesheet" type="text/css" href="thesis.css" /> 
</head><body 
>
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisse50.xml" >next</a>] [<a 
href="#tailthesisse49.xml">tail</a>] [<a 
href="thesisch11.xml#thesisse49.xml" >up</a>] </p></div>
   <h3 class="sectionHead"><span class="titlemark">11.1. </span> <a 
  name="x68-9400011.1"></a>Combination Possibilities</h3>
<!--l. 4140--><p class="noindent">We have two forms of bound, one which uses training set errors and one which uses
holdout set errors. The obvious question to ask is: can we combine the information from
both bounds? Presumably, if we use both the training error and the test error, we should
be able to construct a better confidence interval for the location of the true error
rate.
</p><!--l. 4146--><p class="indent">   Viewed as an interactive proof of learning (see figure  <a 
href="#x68-940011">11.1.1<!--tex4ht:ref: fig-tnt-protocol --></a>), the train and
test approach simply adds an extra testing phase to every training-set-based
bound.
</p>
   <hr class="figure" /><div align="center" class="figure" 
><table class="figure"><tr class="figure"><td class="figure" 
>
                                                                     

                                                                     
<a 
  name="x68-940011"></a>
<!--l. 4151--><p class="indent">
                                                                     

                                                                     
</p><!--l. 4151--><p class="noindent"><img 
src="thesis17x.gif" alt="PIC" class="graphics" width="678.53499pt" height="404.51125pt"  /><!--tex4ht:graphics  
name="thesis17x.gif" src="thesis-presentation/train_n_test.eps"  
-->
<br />     </p><div align="center" class="caption"><table class="caption" 
><tr valign="baseline" class="caption"><td class="id">Figure&#x00A0;11.1.1:        </td><td  
class="content"><a 
  name="x68-940011"></a>The train and test protocol
starts with the learner committing to a &#x201C;prior&#x201D; <!--l. 4158--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>p</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>,
then receiving training examples, choosing a hypothesis, and finally evaluating it on test
examples. The train and test approach can also be composed with PAC-Bayes bounds
(theorem  <a 
href="thesisse26.xml#x39-59001r1">6.2.1<!--tex4ht:ref: th-repbb --></a>), although that composition is not illustrated here.</td></tr></table></div><!--tex4ht:label?: x68-940011 -->
                                                                     

                                                                     
   </td></tr></table></div><hr class="endfigure" />
<!--l. 4162--><p class="indent">   Given a fixed hypothesis and learning problem, we know that the number of test errors is
Binomially distributed. Given a fixed learning algorithm and learning problem,
the training error has a considerably more complicated distribution. We
can nonetheless regard the training error as a random variable whose
cumulative distribution depends on many parameters, one of
which is the true error rate of the output hypothesis (which is itself a random
variable).
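</p>
<p class="indent">   As a minimal sketch of how the Binomial distribution of the test error turns into a
bound, the following Python fragment inverts the Binomial tail by bisection. The function names,
the tolerance, and the choice of bisection are illustrative assumptions, not part of the
original development.
</p>
<pre class="verbatim">
```python
from math import comb


def binomial_cdf(k, m, p):
    """Probability of observing k or fewer errors in m independent trials."""
    return sum(comb(m, i) * p**i * (1.0 - p) ** (m - i) for i in range(k + 1))


def holdout_upper_bound(k, m, delta, tol=1e-9):
    """Largest error rate p for which k or fewer errors still has probability >= delta."""
    lo, hi = k / m, 1.0
    # binomial_cdf is decreasing in p, so bisection locates the crossing point.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if binomial_cdf(k, m, mid) >= delta:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Hypothetical numbers: 5 errors on 100 holdout examples, delta = 0.05.
    print(holdout_upper_bound(5, 100, 0.05))
```
</pre>
<p class="noindent">   Here k is the number of test errors and m is the test set size. Assuming the
hypothesis is fixed before the test examples are drawn, the true error rate exceeds the returned
value with probability at most delta.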
</p><!--l. 4169--><p class="indent">   How can we construct a confidence interval based upon information from both the
training and testing sets? There are several possibilities.
           </p><ol type="1" class="enumerate1" start="1" 
>
        <li class="enumerate"><a 
  name="x68-94003x1"></a>Construct an interval based upon the probability that <span 
class="ecti-1000">both </span>lower bounds
        are violated.
           </li>
        <li class="enumerate"><a 
  name="x68-94005x2"></a>Construct an interval based upon the probability that at least one of the
        lower bounds is violated.
           </li>
        <li class="enumerate"><a 
  name="x68-94007x3"></a>Something else?</li></ol>
<!--l. 4178--><p class="nopar"> Technique (1) can be visualized by plotting training error against test error and marking
the regions that the bound rules out.
</p><!--l. 4183--><p class="noindent"><img 
src="thesis18x.gif" alt="PIC" class="graphics" width="239.89624pt" height="172.64499pt"  /><!--tex4ht:graphics  
name="thesis18x.gif" src="conf_1.eps"  
-->
</p><!--l. 4186--><p class="indent">   The essential problem with technique (1) is that the resulting true error
bound takes the <span 
class="ecti-1000">maximum </span>(minus a small amount) of the test set bound and the
training set bound. Since we cannot trust either bound
to be tight in every case, the maximum inherits the looseness of the worse of the
two.
</p><!--l. 4191--><p class="indent">   Technique (2) can be seen visually in a similar way:
                                                                     

                                                                     
</p><!--l. 4194--><p class="noindent"><img 
src="thesis19x.gif" alt="PIC" class="graphics" width="239.89624pt" height="172.64499pt"  /><!--tex4ht:graphics  
name="thesis19x.gif" src="conf_2.eps"  
-->
</p><!--l. 4197--><p class="indent">   Technique (2) works moderately well. Mathematically, we calculate the minimum
of the two error bounds and add a small amount, which is equivalent to taking a
union bound. While this approach allows us to combine the bounds, it never
improves on the better of the two, even though an improvement is intuitively possible.
Certainly, if we use two test sets, we expect to construct an improved confidence
interval.
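</p>
<p class="indent">   To make the remark about two test sets concrete, the sketch below compares technique (2)
applied to two test sets against simply pooling them into one. The Hoeffding-style bound is a
stand-in chosen for brevity, and the function names and numbers are hypothetical.
</p>
<pre class="verbatim">
```python
from math import log, sqrt


def hoeffding_upper_bound(errors, m, delta):
    """Upper bound on the true error rate, holding with probability at least 1 - delta."""
    return errors / m + sqrt(log(1.0 / delta) / (2.0 * m))


def union_of_two_test_sets(e1, m1, e2, m2, delta):
    """Technique (2): bound each test set at delta / 2 and keep the smaller bound."""
    return min(
        hoeffding_upper_bound(e1, m1, delta / 2.0),
        hoeffding_upper_bound(e2, m2, delta / 2.0),
    )


def pooled_test_sets(e1, m1, e2, m2, delta):
    """Treat both test sets as one larger test set and spend all of delta on it."""
    return hoeffding_upper_bound(e1 + e2, m1 + m2, delta)


if __name__ == "__main__":
    # Hypothetical counts: 10 errors on each of two test sets of size 100.
    args = (10, 100, 10, 100, 0.05)
    print("union bound:", union_of_two_test_sets(*args))
    print("pooled     :", pooled_test_sets(*args))
```
</pre>
<p class="noindent">   When the two test sets show similar error counts, the pooled bound is the smaller
one, because it spends all of delta on a sample twice as large. Pooling assumes both test sets
are drawn independently of the chosen hypothesis.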
</p><!--l. 4204--><p class="indent">   A better approach may be possible. We would like to construct a rejection region of
the following form:
</p><!--l. 4208--><p class="noindent"><img 
src="thesis20x.gif" alt="PIC" class="graphics" width="239.89624pt" height="172.64499pt"  /><!--tex4ht:graphics  
name="thesis20x.gif" src="conf_4.eps"  
-->
</p><!--l. 4211--><p class="indent">   Such a rejection region has two important properties:
           </p><ol type="1" class="enumerate1" start="1" 
>
        <li class="enumerate"><a 
  name="x68-94009x1"></a>If one bound is loose, it does not greatly harm the final true error bound.
           </li>
        <li class="enumerate"><a 
  name="x68-94011x2"></a>The final true error bound can be tighter than either individual true error
        bound.</li></ol>
<!--l. 4217--><p class="nopar"> Showing that technique (2) works is just an application of the union
bound. Given any two bounds on the true error rate, we can apportion <!--l. 4219--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mfrac><mrow 
><mi 
>&#x03B4;</mi></mrow>
<mrow 
><mn>2</mn></mrow></mfrac></mrow></math>
confidence to each bound. Then both bounds hold simultaneously with probability at least <!--l. 4220--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mn>1</mn><mo 
class="MathClass-bin">&#x2212;</mo><mi 
>&#x03B4;</mi></mrow></math>, which
implies that the smaller of the two true error bounds holds.
</p>
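<p class="indent">   The union bound argument for technique (2) can be sketched in a few lines of Python.
The two bound functions are illustrative stand-ins (a relaxed Occam&#x2019;s Razor style training
bound and a Hoeffding-style holdout bound), and every name and number here is a hypothetical
choice rather than part of the original text.
</p>
<pre class="verbatim">
```python
from math import log, sqrt


def occam_training_bound(train_errors, m, prior_h, delta):
    """Relaxed Occam's Razor style bound; prior_h is the prior mass p(h) of the chosen hypothesis."""
    return train_errors / m + sqrt((log(1.0 / prior_h) + log(1.0 / delta)) / (2.0 * m))


def holdout_bound(test_errors, m, delta):
    """Hoeffding-style bound on the true error rate from a holdout set."""
    return test_errors / m + sqrt(log(1.0 / delta) / (2.0 * m))


def combine_by_union_bound(bounds, delta):
    """Give each bound an equal share of delta and return the smallest resulting bound.

    Each element of bounds is a callable mapping a failure probability to an
    upper bound on the true error rate.
    """
    share = delta / len(bounds)
    return min(bound(share) for bound in bounds)


if __name__ == "__main__":
    delta = 0.05
    bounds = [
        lambda d: occam_training_bound(30, 1000, 1e-6, d),  # hypothetical numbers
        lambda d: holdout_bound(12, 200, d),                # hypothetical numbers
    ]
    print(combine_by_union_bound(bounds, delta))
```
</pre>
<p class="noindent">   Because each bound is evaluated at delta / 2 rather than delta, the combined
value is the minimum of the two bounds plus a small amount, and it holds with probability at
least 1 - delta.
</p>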
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisse50.xml" >next</a>] [<a 
href="thesisse49.xml" >front</a>] [<a 
href="thesisch11.xml#thesisse49.xml" >up</a>] </p></div><a 
  name="tailthesisse49.xml"></a>  
</body> 
</html> 
