<?xml version="1.0"?> 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "mathml.dtd"> 
<?xml-stylesheet type="text/css" href="thesis.css"?> 
<html  
xmlns="http://www.w3.org/1999/xhtml"  
><head><title>2.1 Formal Model</title> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> 
<meta name="generator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<meta name="originator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<!-- 3,early_,early^,xhtml,mozilla --> 
<meta name="src" content="thesis.tex" /> 
<meta name="date" content="2002-08-28 13:56:00" /> 
<link rel="stylesheet" type="text/css" href="thesis.css" /> 
</head><body 
>
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisse7.xml" >next</a>] [<a 
href="#tailthesisse6.xml">tail</a>] [<a 
href="thesisch2.xml#thesisse6.xml" >up</a>] </p></div>
   <h3 class="sectionHead"><span class="titlemark">2.1. </span> <a 
  name="x11-100002.1"></a>Formal Model</h3>
<!--l. 325--><p class="noindent">Let <!--l. 325--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">       <mrow 
><mi 
>X</mi></mrow></math> be the space of the
input to a predictor and <!--l. 325--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">       <mrow 
><mi 
>Y</mi> </mrow></math>
be the space of the output. A labeled example <!--l. 326--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi><mo 
class="MathClass-punc">,</mo><mi 
>y</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math> consists of an input, <!--l. 326--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>x</mi></mrow></math>, and the desired
output, <!--l. 327--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><mi 
>y</mi></mrow></math>. Our
formal model starts with the assumption that all labeled examples are drawn independently from a
distribution <!--l. 328--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><mi 
>D</mi></mrow></math>
over the space <!--l. 329--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">       <mrow 
><mi 
>X</mi> <mo 
class="MathClass-bin">&#x00D7;</mo> <mi 
>Y</mi> </mrow></math>.
This is strictly more general than the &#x2018;target concept&#x2019; model, which assumes that there exists some
function <!--l. 330--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><mi 
>f</mi> <mo 
class="MathClass-punc">:</mo> <mi 
>X</mi> <mo 
class="MathClass-rel">&#x2192;</mo> <mi 
>Y</mi> </mrow></math>
used to generate the label <span class="cite">[<a 
href="thesisli2.xml#XValiant"><span 
class="ecbx-1000">50</span></a>]</span>. In particular, we can model
probabilistic learning problems that do not have a single <!--l. 332--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>Y</mi> </mrow></math> value for
each <!--l. 332--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><mi 
>X</mi></mrow></math>
value. This generalization is essentially &#x201C;free&#x201D; in the sense that it does not add to the
complexity of presenting the results.
</p><!--l. 336--><p class="indent">   The set of <!--l. 336--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><mi 
>m</mi></mrow></math>
independently drawn samples presented to a learning algorithm will be denoted by <!--l. 337--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>S</mi></mrow></math>. The learning algorithm will
output a hypothesis <!--l. 338--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">       <mrow 
><mi 
>h</mi> <mo 
class="MathClass-punc">:</mo> <mi 
>X</mi> <mo 
class="MathClass-rel">&#x2192;</mo> <mi 
>Y</mi> </mrow></math>
which has some unobservable true error rate <!--l. 338--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><msub><mrow 
><mi 
>e</mi></mrow><mrow 
><mi 
>D</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math> and an observable
empirical error rate <!--l. 339--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">       <mrow 
><msub><mrow 
><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover></mrow><mrow 
><mi 
>S</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>.
</p>
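<p class="indent">   As a minimal sketch of this sampling model, the following Python code draws m labeled examples independently from a toy distribution D over X &#x00D7; Y. The particular distribution (x uniform over three values, y equal to the parity of x with the label flipped about one time in ten) and the names draw_example and draw_sample are assumptions made purely for illustration. Because the labels are noisy, the same x can appear in S with different values of y, which no single target function could produce. S is held as a list, since the same pair may be drawn more than once.
</p>
<pre>
```python
import random

def draw_example(rng):
    """Draw one labeled example (x, y) from a toy distribution D (illustrative assumption)."""
    x = rng.randrange(3)              # input: uniform over X = {0, 1, 2}
    flip = rng.random() > 0.9         # label noise: true with probability about 0.1
    y = (x % 2) ^ int(flip)           # output in Y = {0, 1}: parity of x unless flipped
    return x, y

def draw_sample(m, seed=0):
    """Return S, a list of m examples drawn independently from D."""
    rng = random.Random(seed)
    return [draw_example(rng) for _ in range(m)]

S = draw_sample(m=1000)
print(len(S), sorted(set(S)))         # sample size and the distinct (x, y) pairs seen
```
</pre>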
   <div class="newtheorem">
<!--l. 341--><p class="noindent"><span class="head">
<a 
  name="x11-10001r1"></a>
  <span 
class="eccc-1000">D<small 
class="small-caps">E</small><small 
class="small-caps">F</small><small 
class="small-caps">I</small><small 
class="small-caps">N</small><small 
class="small-caps">I</small><small 
class="small-caps">T</small><small 
class="small-caps">I</small><small 
class="small-caps">O</small><small 
class="small-caps">N</small> </span>2.1.1<span 
class="eccc-1000">.</span></span>
</p><!--l. 342--><p class="indent">   (True Error) The true error <!--l. 342--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><msub><mrow 
><mi 
>e</mi></mrow><mrow 
><mi 
>D</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
of a hypothesis <!--l. 342--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>h</mi></mrow></math>
is defined in the following way: <!--l. 344--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">
<mrow 
>
                          <msub><mrow 
><mi 
>e</mi></mrow><mrow 
><mi 
>D</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">=</mo><msub><mrow 
><mo 
> Pr</mo></mrow><mrow 
><mi 
>D</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-rel">&#x2260;</mo><mi 
>y</mi></mrow><mo 
class="MathClass-close">)</mo></mrow>
</mrow></math>
</p>
   </div>
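<p class="indent">   To make the definition concrete, the sketch below computes the true error exactly for a small joint distribution that is assumed to be known. This assumption holds only for the purpose of illustration; the probability table D, the function name true_error and the hypothesis h are all hypothetical. In this toy distribution every x carries both labels with positive probability, so no hypothesis can reach zero true error.
</p>
<pre>
```python
from fractions import Fraction

# Toy joint distribution D over X x Y, written as {(x, y): probability}.
# The numbers are illustrative assumptions; they sum to 1.
D = {
    (0, 0): Fraction(9, 20), (0, 1): Fraction(1, 20),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(2, 5),
}

def true_error(h, D):
    """e_D(h) = Pr_D(h(x) != y): total probability of the pairs that h mislabels."""
    return sum(p for (x, y), p in D.items() if h(x) != y)

def h(x):
    """A hypothesis h: X -> Y that predicts the label equal to the input."""
    return x

print(true_error(h, D))   # prints 3/20
```
</pre>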
<!--l. 348--><p class="indent">   Unfortunately, the true error is not an observable quantity in our model because the distribution,
<!--l. 349--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">     <mrow 
><mi 
>D</mi></mrow></math>, is
unknown. However, a related quantity, the empirical error, is observable.
</p>
   <div class="newtheorem">
<!--l. 352--><p class="noindent"><span class="head">
<a 
  name="x11-10002r2"></a>
  <span 
class="eccc-1000">D<small 
class="small-caps">E</small><small 
class="small-caps">F</small><small 
class="small-caps">I</small><small 
class="small-caps">N</small><small 
class="small-caps">I</small><small 
class="small-caps">T</small><small 
class="small-caps">I</small><small 
class="small-caps">O</small><small 
class="small-caps">N</small> </span>2.1.2<span 
class="eccc-1000">.</span></span>
</p><!--l. 353--><p class="indent">   (Empirical Error) Given a sample set <!--l. 353--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>S</mi></mrow></math>,
the <span 
class="ecti-1000">empirical error</span>, <!--l. 353--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover></mrow><mrow 
><mi 
>S</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
is defined as: <!--l. 355--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">      <mrow 
><msub><mrow 
>
                        <mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover></mrow><mrow 
><mi 
>S</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">=</mo><msub><mrow 
><mo 
> Pr</mo></mrow><mrow 
><mi 
>S</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-rel">&#x2260;</mo><mi 
>y</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">=</mo>  <mfrac><mrow 
><mn>1</mn></mrow> 
<mrow 
><mi 
>m</mi></mrow></mfrac><msubsup><mrow 
> <mo 
class="MathClass-op">&#x2211;</mo>
    </mrow><mrow 
><mi 
>i</mi><mo 
class="MathClass-rel">=</mo><mn>1</mn></mrow><mrow 
><mi 
>m</mi></mrow></msubsup 
><mi 
>I</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><msub><mrow 
><mi 
>x</mi></mrow><mrow 
>
<mi 
>i</mi></mrow></msub 
></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-rel">&#x2260;</mo><msub><mrow 
><mi 
>y</mi></mrow><mrow 
><mi 
>i</mi></mrow></msub 
></mrow><mo 
class="MathClass-close">)</mo></mrow>
</mrow></math>
where <!--l. 357--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>I</mi><mrow><mo 
class="MathClass-open">(</mo><mrow></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
is the indicator function, which maps &#x201C;true&#x201D; to <!--l. 357--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mn>1</mn></mrow></math>
and &#x201C;false&#x201D; to <!--l. 358--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mn>0</mn></mrow></math>.
Here <!--l. 358--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><msub><mrow 
><mo 
>Pr</mo></mrow><mrow 
><mi 
>S</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mo 
class="MathClass-punc">.</mo><mo 
class="MathClass-punc">.</mo><mo 
class="MathClass-punc">.</mo></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
is a probability taken with respect to the uniform distribution over the set of
examples, <!--l. 359--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">      <mrow 
><mi 
>S</mi></mrow></math>.
</p>
   </div>
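<p class="indent">   The empirical error can be computed directly from a sample, as the following sketch suggests. The function name empirical_error, the five-example sample S and the parity hypothesis h are assumptions chosen for illustration. The sum mirrors the definition term by term: each example contributes the indicator value 1 when h mislabels it and 0 otherwise, and the total is divided by m.
</p>
<pre>
```python
def empirical_error(h, S):
    """Fraction of the examples (x_i, y_i) in S for which h(x_i) != y_i."""
    m = len(S)
    if m == 0:
        raise ValueError("S must contain at least one example")
    # int(...) plays the role of the indicator I(): True maps to 1, False to 0.
    return sum(int(h(x) != y) for x, y in S) / m

def h(x):
    """Illustrative hypothesis: predict the parity of x."""
    return x % 2

# A hypothetical sample of m = 5 labeled examples; h mislabels only (3, 0).
S = [(0, 0), (1, 1), (2, 0), (3, 0), (4, 0)]

print(empirical_error(h, S))   # prints 0.2
```
</pre>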
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisse7.xml" >next</a>] [<a 
href="thesisse6.xml" >front</a>] [<a 
href="thesisch2.xml#thesisse6.xml" >up</a>] </p></div><a 
  name="tailthesisse6.xml"></a>  
</body> 
</html> 
