<?xml version="1.0"?> 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "mathml.dtd"> 
<?xml-stylesheet type="text/css" href="thesis.css"?> 
<html  
xmlns="http://www.w3.org/1999/xhtml"  
><head><title>9.4 Covering number calculations</title> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> 
<meta name="generator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<meta name="originator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<!-- 3,early_,early^,xhtml,mozilla --> 
<meta name="src" content="thesis.tex" /> 
<meta name="date" content="2002-08-28 13:56:00" /> 
<link rel="stylesheet" type="text/css" href="thesis.css" /> 
</head><body 
>
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisse43.xml" >next</a>] [<a 
href="thesisse41.xml" >prev</a>] [<a 
href="thesisse41.xml#tailthesisse41.xml" >prev-tail</a>] [<a 
href="#tailthesisse42.xml">tail</a>] [<a 
href="thesisch9.xml#thesisse42.xml" >up</a>] </p></div>
   <h3 class="sectionHead"><span class="titlemark">9.4. </span> <a 
  name="x59-840009.4"></a>Covering number calculations</h3>
<!--l. 3675--><p class="noindent">It is important to demonstrate that this covering number is feasible to calculate
and gives a better answer than the traditional approach. We will do this by
first calculating the bracketing covering number for a very simple continuous
classifier and then comparing the results with the traditional covering number
approach.
</p><!--l. 3681--><p class="indent">   Bounds on bracketing covering numbers have already been proved for many function classes
<span class="cite">[<a 
href="thesisli2.xml#XDudley"><span 
class="ecbx-1000">13</span></a>]</span><span class="cite">[<a 
href="thesisli2.xml#XVW"><span 
class="ecbx-1000">49</span></a>]</span>. Here, we present a proof for the simplest of continuous hypothesis spaces:
step functions on a line segment. Each hypothesis is indexed by a number <!--l. 3684--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>a</mi> <mo 
class="MathClass-rel">&#x2208;</mo> <mrow><mo 
class="MathClass-open">[</mo><mrow><mn>0</mn><mo 
class="MathClass-punc">,</mo><mn>1</mn></mrow><mo 
class="MathClass-close">]</mo></mrow></mrow></math> according to: <!--l. 3685--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">
<mrow 
>
                          <msub><mrow 
><mi 
>h</mi></mrow><mrow 
><mi 
>a</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">=</mo> <!--mstyle 
class="text"--><mtext class="textrm">sign</mtext><!--/mstyle--><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi> <mo 
class="MathClass-bin">&#x2212;</mo> <mi 
>a</mi></mrow><mo 
class="MathClass-close">)</mo></mrow>
</mrow></math> What is <!--l. 3687--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><msub><mrow 
><mi 
>N</mi></mrow><mrow 
><mrow><mo 
class="MathClass-open">[</mo><mrow></mrow><mo 
class="MathClass-close">]</mo></mrow></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>H</mi><mo 
class="MathClass-punc">,</mo><mi 
>&#x03B3;</mi><mo 
class="MathClass-punc">,</mo><msub><mrow 
><mi 
>d</mi></mrow><mrow 
><mi 
>D</mi></mrow></msub 
></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math> for
this hypothesis set?
</p>
   <div class="newtheorem">
<!--l. 3689--><p class="noindent"><span class="head">
<a 
  name="x59-84001r1"></a>
  <span 
class="eccc-1000">T<small 
class="small-caps">H</small><small 
class="small-caps">E</small><small 
class="small-caps">O</small><small 
class="small-caps">R</small><small 
class="small-caps">E</small><small 
class="small-caps">M</small> </span>9.4.1<span 
class="eccc-1000">.</span></span>
</p><!--l. 3690--><p class="indent">   <span 
class="ecti-1000">Assume that </span><!--l. 3690--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>D</mi></mrow></math>
<span 
class="ecti-1000">can be described by a probability density function. Then:</span>
</p><!--l. 3692--><p class="indent">   <!--l. 3692--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">
                                                                     

                                                                     
<mrow 
>
                         <msub><mrow 
><mi 
>N</mi></mrow><mrow 
><mrow><mo 
class="MathClass-open">[</mo><mrow></mrow><mo 
class="MathClass-close">]</mo></mrow></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>H</mi><mo 
class="MathClass-punc">,</mo><mi 
>&#x03B3;</mi><mo 
class="MathClass-punc">,</mo><msub><mrow 
><mi 
>d</mi></mrow><mrow 
><mi 
>D</mi></mrow></msub 
></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2264;</mo> <mfrac><mrow 
><mn>1</mn></mrow> 
<mrow 
><mi 
>&#x03B3;</mi></mrow></mfrac> <mo 
class="MathClass-bin">+</mo> <mn>1</mn>
</mrow></math>
</p>
   </div>
   <div class="proof">
<!--l. 3697--><p class="indent">   <span class="head">
   <span 
class="eccc-1000">P<small 
class="small-caps">R</small><small 
class="small-caps">O</small><small 
class="small-caps">O</small><small 
class="small-caps">F</small>.</span> </span>Consider a range of hypotheses from <!--l. 3697--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mi 
>h</mi></mrow><mrow 
><mi 
>a</mi></mrow></msub 
></mrow></math>
to <!--l. 3697--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mi 
>h</mi></mrow><mrow 
><mi 
>b</mi></mrow></msub 
></mrow></math>.
For this range of hypotheses, we can choose a bracketing pair, <!--l. 3698--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>f</mi><mo 
class="MathClass-punc">,</mo><msup><mrow 
><mi 
>f</mi></mrow><mrow 
><mi 
>&#x2032;</mi></mrow></msup 
></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>.
In particular, we can choose <!--l. 3699--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>f</mi></mrow></math>
and <!--l. 3699--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msup><mrow 
><mi 
>f</mi></mrow><mrow 
><mi 
>&#x2032;</mi></mrow></msup 
></mrow></math>
which agree on <!--l. 3699--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mrow><mo 
class="MathClass-open">[</mo><mrow><mn>0</mn><mo 
class="MathClass-punc">,</mo><mi 
>a</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
and <!--l. 3699--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mrow><mo 
class="MathClass-open">[</mo><mrow><mi 
>b</mi><mo 
class="MathClass-punc">,</mo><mn>1</mn></mrow><mo 
class="MathClass-close">]</mo></mrow></mrow></math>
and always predict either incorrectly (<!--l. 3700--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>f</mi></mrow></math>)
or correctly (<!--l. 3700--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msup><mrow 
><mi 
>f</mi></mrow><mrow 
><mi 
>&#x2032;</mi></mrow></msup 
></mrow></math>)
on <!--l. 3701--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mrow><mo 
class="MathClass-open">[</mo><mrow><mi 
>a</mi><mo 
class="MathClass-punc">,</mo><mi 
>b</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>.
The distance between these functions satisfies: <!--l. 3702--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">      <mrow 
>
                                    <msub><mrow 
><mi 
>d</mi></mrow><mrow 
><mi 
>D</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>f</mi><mo 
class="MathClass-punc">,</mo><msup><mrow 
><mi 
>f</mi></mrow><mrow 
><mi 
>&#x2032;</mi></mrow></msup 
></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">=</mo><msub><mrow 
><mo 
> Pr</mo></mrow><mrow 
>
<mi 
>D</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi> <mo 
class="MathClass-rel">&#x2208;</mo> <mrow><mo 
class="MathClass-open">[</mo><mrow><mi 
>a</mi><mo 
class="MathClass-punc">,</mo><mi 
>b</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow>
</mrow></math>
and every hypothesis <!--l. 3704--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mi 
>h</mi></mrow><mrow 
><mi 
>c</mi></mrow></msub 
></mrow></math>
in <!--l. 3704--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mrow><mo 
class="MathClass-open">[</mo><mrow><mi 
>a</mi><mo 
class="MathClass-punc">,</mo><mi 
>b</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
satisfies <!--l. 3704--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>&#x2200;</mi><mi 
>z</mi> <mi 
>f</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>z</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2265;</mo> <msub><mrow 
><mi 
>h</mi></mrow><mrow 
><mi 
>c</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>z</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2265;</mo> <msup><mrow 
><mi 
>f</mi></mrow><mrow 
><mi 
>&#x2032;</mi></mrow></msup 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>z</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>.
Consequently, if <!--l. 3705--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>a</mi></mrow></math>
and <!--l. 3705--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>b</mi></mrow></math>
                                                                     

                                                                     
are chosen appropriately, we will observe <!--l. 3706--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mi 
>d</mi></mrow><mrow 
><mi 
>D</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>f</mi><mo 
class="MathClass-punc">,</mo><msup><mrow 
><mi 
>f</mi></mrow><mrow 
><mi 
>&#x2032;</mi></mrow></msup 
></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2264;</mo> <mi 
>&#x03B3;</mi></mrow></math>.
</p><!--l. 3708--><p class="indent">   If <!--l. 3708--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>D</mi></mrow></math>
can be described by a probability density function, then we can calculate the
marginal distribution, <!--l. 3709--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mi 
>D</mi></mrow><mrow 
><mi 
>x</mi></mrow></msub 
></mrow></math>,
and the cumulative distribution function of this marginal, <!--l. 3710--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mi 
>F</mi></mrow><mrow 
><mi 
>D</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>.
Now, find <!--l. 3710--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mi 
>a</mi></mrow><mrow 
><mi 
>i</mi></mrow></msub 
></mrow></math>
for which <!--l. 3710--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mi 
>F</mi></mrow><mrow 
><mi 
>D</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><msub><mrow 
><mi 
>a</mi></mrow><mrow 
><mi 
>i</mi></mrow></msub 
></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">=</mo> <mi
>i</mi><mi
>&#x03B3;</mi></mrow></math>
for <!--l. 3711--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>i</mi> <mo 
class="MathClass-rel">&#x003C;</mo> <mfrac><mrow 
><mn>1</mn></mrow> 
<mrow 
><mi 
>&#x03B3;</mi></mrow></mfrac></mrow></math>.
Choose <!--l. 3711--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mi 
>b</mi></mrow><mrow 
><mi 
>i</mi></mrow></msub 
> <mo 
class="MathClass-rel">=</mo> <msub><mrow 
><mi 
>a</mi></mrow><mrow 
><mi 
>i</mi><mo 
class="MathClass-bin">+</mo><mn>1</mn></mrow></msub 
></mrow></math>.
There are at most <!--l. 3712--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mfrac><mrow 
><mn>1</mn></mrow>
<mrow 
><mi 
>&#x03B3;</mi></mrow></mfrac> <mo 
class="MathClass-bin">+</mo> <mn>1</mn></mrow></math>
intervals, each with a measure (according to <!--l. 3712--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>D</mi></mrow></math>)
of at most <!--l. 3713--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>&#x03B3;</mi></mrow></math>.
Consequently, we can cover <!--l. 3713--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>H</mi></mrow></math>
with <!--l. 3713--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mfrac><mrow 
><mn>1</mn></mrow>
<mrow 
><mi 
>&#x03B3;</mi></mrow></mfrac> <mo 
class="MathClass-bin">+</mo> <mn>1</mn></mrow></math>
pairs of <!--l. 3714--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mrow><mo 
class="MathClass-open">(</mo><mrow><msub><mrow 
><mi 
>f</mi></mrow><mrow 
><mi 
>i</mi></mrow></msub 
><mo 
class="MathClass-punc">,</mo><msubsup><mrow 
><mi 
>f</mi></mrow><mrow 
><mi 
>i</mi></mrow><mrow 
><mi 
>&#x2032;</mi></mrow></msubsup 
></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>.
<span class="qed"><span 
class="msam-10">&#x25AB;</span></span>
</p>
   </div>
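<p class="indent">   The construction in the proof can be sketched numerically. The following is a minimal illustration, not part of the original argument: it assumes a marginal on [0, 1] with density 2x (so the cumulative distribution function is x squared), and the helper names <span class="ecti-1000">bracket_endpoints</span> and <span class="ecti-1000">bracket_index</span> are hypothetical. It places the thresholds at levels of the cumulative distribution function spaced &#x03B3; apart, closes off any leftover mass with one final interval, and reports the mass of each interval.</p>
<pre class="verbatim">
```python
import bisect
import math


def bracket_endpoints(inverse_cdf, gamma):
    """Return thresholds a_0, a_1, ... with F_D(a_i) = i * gamma.

    A final endpoint at F_D = 1 closes off any leftover mass.
    """
    n_full = math.floor(1.0 / gamma + 1e-9)  # largest i with i * gamma at most 1
    levels = [min(1.0, i * gamma) for i in range(n_full + 1)]
    if 1.0 - levels[-1] > 1e-9:  # leftover mass forms one last, lighter interval
        levels.append(1.0)
    return [inverse_cdf(q) for q in levels]


def bracket_index(endpoints, c):
    """Index i of the interval [a_i, a_{i+1}) that contains the threshold c."""
    i = bisect.bisect_right(endpoints, c) - 1
    return min(i, len(endpoints) - 2)  # c = 1 falls into the last interval


# Assumed illustrative marginal on [0, 1] with density 2x, so F_D(x) = x ** 2.
cdf = lambda x: x * x
inverse_cdf = math.sqrt

gamma = 0.3
endpoints = bracket_endpoints(inverse_cdf, gamma)
masses = [cdf(b) - cdf(a) for a, b in zip(endpoints, endpoints[1:])]

print("endpoints:", [round(a, 3) for a in endpoints])
print("interval masses:", [round(p, 3) for p in masses])
print("brackets used:", len(masses), "bound:", 1.0 / gamma + 1.0)
```
</pre>
<p class="indent">   Every hypothesis whose threshold falls in a given interval is bracketed by the pair attached to that interval, so the number of intervals is the number of bracketing pairs used. Any other marginal with a density can be substituted by supplying its own inverse cumulative distribution function.</p>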
<!--l. 3716--><p class="indent">   Given that the bracketing covering number is at most <!--l. 3716--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mfrac><mrow 
><mn>1</mn></mrow>
<mrow 
><mi 
>&#x03B3;</mi></mrow></mfrac> <mo 
class="MathClass-bin">+</mo> <mn>1</mn></mrow></math>, we can use
theorem  <a 
href="thesisse41.xml#x58-83001r1">9.3.1<!--tex4ht:ref: th-bracketing_cover --></a> to define a constraint that the true error rate must satisfy with high probability.
Setting <!--l. 3718--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><mi 
>&#x03B3;</mi> <mo 
class="MathClass-rel">=</mo>   <mfrac><mrow 
><mn>1</mn></mrow> 
<mrow 
><mi 
>m</mi><mo 
class="MathClass-bin">&#x2212;</mo><mn>1</mn></mrow></mfrac></mrow></math>,
we get: <!--l. 3719--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">        <mrow 
>
                                <mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>f</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2264;</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-bin">+</mo> <mi 
>b</mi> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>m</mi><mo 
class="MathClass-punc">,</mo>   <mfrac><mrow 
><mn>1</mn></mrow> 
<mrow 
><mi 
>m</mi> <mo 
class="MathClass-bin">&#x2212;</mo> <mn>1</mn></mrow></mfrac><mo 
class="MathClass-punc">,</mo>  <mfrac><mrow 
><mi 
>&#x03B4;</mi></mrow> 
<mrow 
><mn>2</mn><mi 
>m</mi></mrow></mfrac></mrow></mfenced>
</mrow></math>
                                                                     

                                                                     
and <!--l. 3722--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">        <mrow 
>
                                      <mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2264;</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0304;</mo></mover> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>m</mi><mo 
class="MathClass-punc">,</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>f</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-punc">,</mo>  <mfrac><mrow 
><mi 
>&#x03B4;</mi></mrow> 
<mrow 
><mn>2</mn><mi 
>m</mi></mrow></mfrac></mrow></mfenced>
</mrow></math>
To make the comparison with the standard covering number approaches fair, we
relax our theorem to use the Hoeffding approximation. Note that this
relaxation is somewhat unfavorable to the new approach, because the first inequality is (inherently) a highly biased
Binomial with lower variance. Relaxing to the Hoeffding bound, we get: <!--l. 3728--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">
<mrow 
>
                   <mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>f</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2264;</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-bin">+</mo>    <mfrac><mrow 
><mn>1</mn></mrow> 
<mrow 
><mi 
>m</mi> <mo 
class="MathClass-bin">&#x2212;</mo> <mn>1</mn></mrow></mfrac> <mo 
class="MathClass-bin">+</mo> <msqrt><mi 
></mi>
 <mrow><mfrac><mrow 
><mo 
> ln</mo> <!--nolimits--> <mfrac> <mrow 
> <mn>2</mn><mi 
>m</mi></mrow> 
 <mrow 
><mi 
>&#x03B4;</mi></mrow></mfrac> </mrow>
  <mrow 
><mn>2</mn><mi 
>m</mi></mrow></mfrac></mrow></msqrt>
</mrow></math> and <!--l. 3731--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">
<mrow 
>
                       <mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2264;</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>f</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-bin">+</mo> <msqrt><mi 
></mi>
 <mrow><mfrac><mrow 
><mo 
> ln</mo> <!--nolimits--> <mfrac> <mrow 
> <mn>2</mn><mi 
>m</mi></mrow> 
 <mrow 
><mi 
>&#x03B4;</mi></mrow></mfrac> </mrow>
  <mrow 
><mn>2</mn><mi 
>m</mi></mrow></mfrac></mrow></msqrt>
</mrow></math> which
                                                                     

                                                                     
implies: <!--l. 3734--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">        <mrow 
>
                            <mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2264;</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-bin">+</mo>    <mfrac><mrow 
><mn>1</mn></mrow> 
<mrow 
><mi 
>m</mi> <mo 
class="MathClass-bin">&#x2212;</mo> <mn>1</mn></mrow></mfrac> <mo 
class="MathClass-bin">+</mo> <mn>2</mn><msqrt><mi 
></mi>
 <mrow><mfrac><mrow 
><mo 
> ln</mo> <!--nolimits--> <mn>2</mn><mi 
>m</mi> <mo 
class="MathClass-bin">+</mo><mo 
> ln</mo> <!--nolimits--> <mfrac> <mrow 
> <mn>1</mn></mrow> 
<mrow 
><mi 
>&#x03B4;</mi></mrow></mfrac></mrow>
      <mrow 
><mn>2</mn><mi 
>m</mi></mrow></mfrac></mrow></msqrt>
</mrow></math>
Note again that we are being &#x201C;unfair&#x201D; to the new approach by using the
Hoeffding approximation rather than exact Binomial-tail bounds; the
standard covering number approach has not yet been reduced to exact
Binomial-tail bounds. Using the standard approach, we get a covering number of <!--l. 3739--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>C</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>H</mi><mo 
class="MathClass-punc">,</mo>  <mfrac><mrow 
><mn>1</mn></mrow> 
<mrow 
><mi 
>m</mi><mo 
class="MathClass-bin">&#x2212;</mo><mn>1</mn></mrow></mfrac></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">=</mo> <mi 
>m</mi></mrow></math>. This implies a
bound of: <!--l. 3741--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">        <mrow 
>
                                  <mi 
>e</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2264;</mo><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-bin">+</mo> <mn>4</mn><msqrt><mi 
></mi>
 <mrow><mfrac><mrow 
><mo 
> ln</mo> <!--nolimits--> <mn>8</mn><mi 
>m</mi> <mo 
class="MathClass-bin">+</mo><mo 
> ln</mo> <!--nolimits--> <mfrac> <mrow 
> <mn>1</mn></mrow> 
<mrow 
><mi 
>&#x03B4;</mi></mrow></mfrac></mrow>
       <mrow 
><mi 
>m</mi></mrow></mfrac></mrow></msqrt>
</mrow></math>
Comparing the bounds, we see that the new approach is about <!--l. 3743--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mfrac><mrow 
><mn>16</mn></mrow>
 <mrow 
><mn>2</mn></mrow></mfrac> <mo 
class="MathClass-rel">=</mo> <mn>8</mn></mrow></math> times
more efficient in the number of examples required to achieve a bound on a given
deviation.
</p>
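<p class="indent">   The comparison can also be checked numerically. The sketch below is illustrative only: it evaluates the deviation terms of the two Hoeffding-style bounds exactly as written above and searches for the smallest sample size at which each term drops to a chosen target. The function names, the target of 0.05 and the confidence parameter of 0.05 are assumptions made for the example. Because the logarithmic terms differ (ln 8m versus ln 2m), the computed ratio of sample sizes need not equal 8 exactly.</p>
<pre class="verbatim">
```python
import math


def bracketing_gap(m, delta):
    """Deviation term of the bracketing bound after the Hoeffding relaxation."""
    log_term = math.log(2 * m) + math.log(1.0 / delta)
    return 1.0 / (m - 1) + 2.0 * math.sqrt(log_term / (2 * m))


def standard_gap(m, delta):
    """Deviation term of the standard covering number bound quoted in the text."""
    log_term = math.log(8 * m) + math.log(1.0 / delta)
    return 4.0 * math.sqrt(log_term / m)


def examples_needed(gap_fn, target, delta):
    """Smallest m (at least 2) with gap_fn(m, delta) at most target.

    Relies on both deviation terms decreasing in m for m of 2 or more.
    """
    hi = 2
    while gap_fn(hi, delta) > target:  # double until the target is met
        hi *= 2
    lo = max(2, hi // 2)
    while hi - lo > 1:  # bisection; gap_fn(hi) stays at most target
        mid = (lo + hi) // 2
        if gap_fn(mid, delta) > target:
            lo = mid
        else:
            hi = mid
    return hi


target, delta = 0.05, 0.05  # assumed values, chosen only for illustration
m_bracketing = examples_needed(bracketing_gap, target, delta)
m_standard = examples_needed(standard_gap, target, delta)
ratio = m_standard / m_bracketing

print("examples needed (bracketing):", m_bracketing)
print("examples needed (standard):  ", m_standard)
print("ratio:", round(ratio, 2))
```
</pre>
<p class="indent">   The leading constants alone give the factor of 16/2; the search above simply makes the effect of the remaining logarithmic and 1/(m &#x2212; 1) terms visible for one particular choice of target and confidence.</p>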
   <h4 class="subsectionHead"><span class="titlemark">9.4.1. </span> <a 
  name="x59-850009.4.1"></a>Note on the Bracketing Cover proof</h4>
<!--l. 3750--><p class="noindent">There are several important things to note about this proof.
</p><!--l. 3752--><p class="indent">
           </p><ol type="1" class="enumerate1" start="1" 
>
        <li class="enumerate"><a 
  name="x59-85002x1"></a>We used the property that a small change in the hypothesis affects
        the prediction on only a small portion of the input space.
           </li>
        <li class="enumerate"><a 
  name="x59-85004x2"></a>The bound on <!--l. 3755--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mi 
>N</mi></mrow><mrow 
><mrow><mo 
class="MathClass-open">[</mo><mrow></mrow><mo 
class="MathClass-close">]</mo></mrow></mrow></msub 
></mrow></math>
                                                                     

                                                                     
        holds for all <!--l. 3755--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>D</mi></mrow></math>
        with a density function, not just the <!--l. 3756--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>D</mi></mrow></math>
        which we happen to observe.
           </li>
        <li class="enumerate"><a 
  name="x59-85006x3"></a>The bound on <!--l. 3757--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mi 
>N</mi></mrow><mrow 
><mrow><mo 
class="MathClass-open">[</mo><mrow></mrow><mo 
class="MathClass-close">]</mo></mrow></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>H</mi><mo 
class="MathClass-punc">,</mo><mi 
>&#x03B3;</mi><mo 
class="MathClass-punc">,</mo><mi 
>D</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
        is exactly the same as a bound on <!--l. 3757--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>N</mi> <mfenced separators="" 
open="("  close=")" ><mrow><mi 
>H</mi><mo 
class="MathClass-punc">,</mo> <mfrac><mrow 
><mi 
>&#x03B3;</mi></mrow> 
<mrow 
><mn>2</mn></mrow></mfrac> <mo 
class="MathClass-punc">,</mo><mi 
>D</mi></mrow></mfenced></mrow></math>.</li></ol>
<!--l. 3758--><p class="nopar"> In fact, the proof can be extended to <span 
class="ecti-1000">all  </span><!--l. 3759--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>D</mi></mrow></math>
(even ones with point masses) at the cost of a factor of <!--l. 3760--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mn>2</mn></mrow></math> worsening and
a messier argument. Property (2) is desirable because we often do not know the
distribution <!--l. 3762--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><mi 
>D</mi></mrow></math>
when we wish to apply the bound. Property (1) is an essential technique that
can be used to prove other covering number bounds for this notion of covering
number.
</p><!--l. 3766--><p class="indent">   Can we show partial order covering number bounds for other classifiers? There is a
straightforward extension of the previous proof to classifiers that consist of axis-parallel intervals
in <!--l. 3768--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><msup><mrow 
><mi 
>R</mi></mrow><mrow 
><mi 
>n</mi></mrow></msup 
></mrow></math>.
More work is required to prove partial order covering numbers for the hypothesis spaces
of standard learning algorithms.
</p>
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisse43.xml" >next</a>] [<a 
href="thesisse41.xml" >prev</a>] [<a 
href="thesisse41.xml#tailthesisse41.xml" >prev-tail</a>] [<a 
href="thesisse42.xml" >front</a>] [<a 
href="thesisch9.xml#thesisse42.xml" >up</a>] </p></div><a 
  name="tailthesisse42.xml"></a>   
</body> 
</html> 
