<?xml version="1.0"?> 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "mathml.dtd"> 
<?xml-stylesheet type="text/css" href="thesis.css"?> 
<html  
xmlns="http://www.w3.org/1999/xhtml"  
><head><title>3.6 Arbitrary Loss Functions</title> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> 
<meta name="generator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<meta name="originator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<!-- 3,early_,early^,xhtml,mozilla --> 
<meta name="src" content="thesis.tex" /> 
<meta name="date" content="2002-08-28 13:56:00" /> 
<link rel="stylesheet" type="text/css" href="thesis.css" /> 
</head><body 
>
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisse13.xml" >prev</a>] [<a 
href="thesisse13.xml#tailthesisse13.xml" >prev-tail</a>] [<a 
href="#tailthesisse14.xml">tail</a>] [<a 
href="thesisch3.xml#thesisse14.xml" >up</a>] </p></div>
   <h3 class="sectionHead"><span class="titlemark">3.6. </span> <a 
  name="x20-280003.6"></a>Arbitrary Loss Functions</h3>
<!--l. 843--><p class="noindent">A loss function is <span 
class="ecti-1000">any </span>function which takes a hypothesis, <!--l. 843--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>h</mi></mrow></math>, and an
example <!--l. 844--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">        <mrow 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi><mo 
class="MathClass-punc">,</mo><mi 
>y</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math>
as input and outputs a real number. In particular, we could choose <!--l. 845--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>l</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi><mo 
class="MathClass-punc">,</mo><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi><mo 
class="MathClass-punc">,</mo><mi 
>y</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">=</mo> <mi 
>I</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-rel">&#x2260;</mo><mi 
>y</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow></math> and regard
most of the prior discussion as working with this specialized Hamming loss function, which is <!--l. 846--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mn>1</mn></mrow></math> when <!--l. 847--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>h</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi></mrow><mo 
class="MathClass-close">)</mo></mrow><mo 
class="MathClass-rel">&#x2260;</mo><mi 
>y</mi></mrow></math> and <!--l. 847--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mn>0</mn></mrow></math>
otherwise. Many other possibilities exist. For any bounded loss function, <!--l. 848--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mi 
>l</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi><mo 
class="MathClass-punc">,</mo><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi><mo 
class="MathClass-punc">,</mo><mi 
>y</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">&#x2208;</mo> <mrow><mo 
class="MathClass-open">[</mo><mrow><mn>0</mn><mo 
class="MathClass-punc">,</mo><mn>1</mn></mrow><mo 
class="MathClass-close">]</mo></mrow></mrow></math>, we can
define: <!--l. 849--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">        <mrow 
>
                                       <msub><mrow 
><mi 
>e</mi></mrow><mrow 
><mi 
>D</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">=</mo> <msub><mrow 
><mi 
>E</mi></mrow><mrow 
><mi 
>D</mi></mrow></msub 
><mi 
>l</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi><mo 
class="MathClass-punc">,</mo><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi><mo 
class="MathClass-punc">,</mo><mi 
>y</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow>
</mrow></math>
                                                                     

                                                                     
and <!--l. 852--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="display">        <mrow 
><msub><mrow 
>
                                       <mover 
accent="true"><mrow 
><mi 
>e</mi></mrow><mo 
class="MathClass-op">&#x0302;</mo></mover></mrow><mrow 
><mi 
>S</mi></mrow></msub 
><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi></mrow><mo 
class="MathClass-close">)</mo></mrow> <mo 
class="MathClass-rel">=</mo> <msub><mrow 
><mi 
>E</mi></mrow><mrow 
><mi 
>S</mi></mrow></msub 
><mi 
>l</mi><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>h</mi><mo 
class="MathClass-punc">,</mo><mrow><mo 
class="MathClass-open">(</mo><mrow><mi 
>x</mi><mo 
class="MathClass-punc">,</mo><mi 
>y</mi></mrow><mo 
class="MathClass-close">)</mo></mrow></mrow><mo 
class="MathClass-close">)</mo></mrow>
</mrow></math>
All of the bounds reported here apply to loss functions bounded in the interval <!--l. 855--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mrow><mo 
class="MathClass-open">[</mo><mrow><mn>0</mn><mo 
class="MathClass-punc">,</mo><mn>1</mn></mrow><mo 
class="MathClass-close">]</mo></mrow></mrow></math>. The
fundamental advantage of this generality is that we need <span 
class="ecti-1000">not </span>treat the hypothesis as a
black box. Instead, we can derive a bound whose loss function depends on the
structure of the hypothesis. This will be important later when discussing averaging
bounds (Theorem  <a 
href="thesisse29.xml#x44-65001r1">7.1.1<!--tex4ht:ref: th-margin --></a>), where the bound depends in part on the structure of the
hypothesis.
</p><!--l. 861--><p class="indent">   For clarity of presentation, the bounds will all be presented using the Hamming
loss. However, they all apply in the more general setting of arbitrary loss functions bounded in <!--l. 863--><math 
xmlns="http://www.w3.org/1998/Math/MathML" 
mode="inline">
<mrow 
><mrow><mo 
class="MathClass-open">[</mo><mrow><mn>0</mn><mo 
class="MathClass-punc">,</mo><mn>1</mn></mrow><mo 
class="MathClass-close">]</mo></mrow></mrow></math>.
</p>
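<p class="indent">   As a minimal sketch of the definitions in this section, the Python fragment below computes the empirical loss of a hypothesis on a sample for two losses bounded in [0,1]: the Hamming loss I(h(x) != y), and a clipped margin loss that depends on the real-valued score of the hypothesis rather than only on its predicted label. The function names, the labels in {-1, +1}, the real-valued hypothesis, and the particular clipped margin loss are all assumptions made for illustration; none of them is taken from the text above.</p>
<pre>
```python
from typing import Callable, Sequence, Tuple

# Assumptions of this sketch: x is a real number, y is a label in {-1, +1},
# and a hypothesis returns a real-valued score whose sign is the prediction.
Example = Tuple[float, int]
Hypothesis = Callable[[float], float]
Loss = Callable[[Hypothesis, Example], float]


def hamming_loss(h: Hypothesis, example: Example) -> float:
    """l(h,(x,y)) = I(h(x) != y): 1 on a mistake, 0 otherwise."""
    x, y = example
    predicted = 1 if h(x) >= 0 else -1
    return 1.0 if predicted != y else 0.0


def clipped_margin_loss(h: Hypothesis, example: Example) -> float:
    """A hypothetical loss in [0,1] that looks at the score, not only the label."""
    x, y = example
    return min(1.0, max(0.0, 1.0 - y * h(x)))


def empirical_loss(loss: Loss, h: Hypothesis, sample: Sequence[Example]) -> float:
    """Average of l(h,(x,y)) over the sample S (the empirical counterpart of e_D(h))."""
    return sum(loss(h, example) for example in sample) / len(sample)


def h(x: float) -> float:
    """A toy hypothesis, chosen only for illustration."""
    return 0.5 * x


S = [(2.0, 1), (1.0, 1), (-1.0, 1), (-4.0, -1)]

hamming_value = empirical_loss(hamming_loss, h, S)        # 0.25
margin_value = empirical_loss(clipped_margin_loss, h, S)  # 0.375
```
</pre>
<p class="indent">   On this toy sample the two losses disagree: the example (1.0, 1) is classified correctly, so it contributes nothing to the Hamming loss, yet it contributes 0.5 to the clipped margin loss because its score is small. This is the sense in which a loss can depend on the structure of the hypothesis.</p>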
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisse13.xml" >prev</a>] [<a 
href="thesisse13.xml#tailthesisse13.xml" >prev-tail</a>] [<a 
href="thesisse14.xml" >front</a>] [<a 
href="thesisch3.xml#thesisse14.xml" >up</a>] </p></div><a 
  name="tailthesisse14.xml"></a>  
</body> 
</html> 
