<?xml version="1.0"?> 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "mathml.dtd"> 
<?xml-stylesheet type="text/css" href="thesis.css"?> 
<html  
xmlns="http://www.w3.org/1999/xhtml"  
><head><title>5 Microchoice Bounds (the algebra of choices)</title> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> 
<meta name="generator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<meta name="originator" content="TeX4ht (http://www.cis.ohio-state.edu/~gurari/TeX4ht/mn.html)" /> 
<!-- 3,early_,early^,xhtml,mozilla --> 
<meta name="src" content="thesis.tex" /> 
<meta name="date" content="2002-08-28 13:56:00" /> 
<link rel="stylesheet" type="text/css" href="thesis.css" /> 
</head><body 
>
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisch6.xml" >next</a>] [<a 
href="#tailthesisch5.xml">tail</a>] [<a 
href="thesispa2.xml#thesisch5.xml" >up</a>] </p></div>
   <h2 class="chapterHead"><span class="titlemark">Chapter&#x00A0;5</span><br /><a 
  name="x29-380005"></a>Microchoice Bounds (the algebra of choices)</h2>
<!--l. 1343--><p class="noindent">The work in this chapter is joint with Avrim Blum and was first presented at ICML <span class="cite">[<a 
href="thesisli2.xml#XMC"><span 
class="ecbx-1000">30</span></a>]</span>
and later published in the Machine Learning journal <span class="cite">[<a 
href="thesisli2.xml#XMC_journal"><span 
class="ecbx-1000">31</span></a>]</span>. The presentation here generalizes,
unifies, and improves the earlier work. Microchoice bounds can be thought of as
unifying &#x201C;Self-bounding Learning Algorithms&#x201D; <span class="cite">[<a 
href="thesisli2.xml#XSB"><span 
class="ecbx-1000">17</span></a>]</span> and the Occam&#x2019;s Razor bound
<span class="cite">[<a 
href="thesisli2.xml#XBEHW"><span 
class="ecbx-1000">5</span></a>]</span>.
</p><!--l. 1349--><p class="indent">   The bounds of the previous chapter exploit little of the structure of the hypothesis
space. Yet learning algorithms induce a natural structure. In particular, many learning
algorithms work iteratively, taking a sequence of steps, each chosen from
a small set of choices (small in comparison to the overall hypothesis set size). Local
optimization algorithms such as hill-climbing or simulated annealing, for example, work
in this manner. Each step in a local optimization algorithm can be viewed as making a
choice from a small set of possible steps to take. If we take into account the number of
choices made and the size (and other properties) of each choice set, can we produce
a tighter bound? This chapter introduces microchoice bounds, which use this
type of information to construct high-confidence bounds on the future error
rate.
</p><!--l. 1361--><p class="indent">   The microchoice bounds can be thought of in several ways. The simple microchoice
bound (given in section  <a 
href="thesisse22.xml#x32-400005.2">5.2<!--tex4ht:ref: sec:mc --></a>) can be thought of as a well-motivated application of the
Occam&#x2019;s Razor bound (theorem <a 
href="thesisse20.xml#x27-36001r1">4.6.1<!--tex4ht:ref: th-ORB --></a>). The idea is to use the learning algorithm itself to define
a description language for hypotheses, so that the description length of the
hypothesis actually produced gives a bound on the estimation error. The adaptive
microchoice theorem (in section  <a 
href="thesisse23.xml#x33-460005.3">5.3<!--tex4ht:ref: sec:query --></a>) can be thought of as a computationally tractable
adaptation of &#x201C;Self-bounding Learning Algorithms&#x201D; <span class="cite">[<a 
href="thesisli2.xml#XSB"><span 
class="ecbx-1000">17</span></a>]</span>. Microchoice bounds tie
together and show the relationship between these different sample complexity
bounds.
</p><!--l. 1371--><p class="indent">   Microchoice bounds also provide insight into the nature of choices. In general, we
know that choice is &#x201C;bad&#x201D; for the purposes of creating a uniform bound on the true error
rate. The microchoice bounds quantify how &#x201C;bad&#x201D; choice is.
In particular, the log of the choice space size is the natural measure of &#x201C;badness&#x201D;.
This is directly related to the log of the hypothesis space size in the discrete hypothesis
bound (theorem <a 
href="thesisse16.xml#x23-32001r1">4.2.1<!--tex4ht:ref: th-DHSCP --></a>). There is also an indirect relationship with information theory, where the
log of the alphabet size is an important parameter for specifying the number of bits
required to send a message.
</p><!--l. 1381--><p class="indent">   Viewed as an interactive proof of learning, microchoice bounds can be described
pictorially as in figure  <a 
href="#x29-380011">5.0.1<!--tex4ht:ref: fig-microchoice-protocol --></a>.
</p>
   <hr class="figure" /><div align="center" class="figure" 
><table class="figure"><tr class="figure"><td class="figure" 
>
                                                                     

                                                                     
<a 
  name="x29-380011"></a>
<!--l. 1385--><p class="indent">
                                                                     

                                                                     
</p><!--l. 1385--><p class="noindent"><img 
src="thesis5x.gif" alt="PIC" class="graphics" width="613.29124pt" height="404.51125pt"  /><!--tex4ht:graphics  
name="thesis5x.gif" src="thesis-presentation/microchoice.eps"  
-->
<br /> </p><div align="center" class="caption"><table class="caption" 
><tr valign="baseline" class="caption"><td class="id">Figure&#x00A0;5.0.1:  </td><td  
class="content"><a 
  name="x29-380011"></a>Microchoice bounds collapse the two-round Occam&#x2019;s Razor-style
protocol into a one-round protocol. This is (essentially) done by providing a compiler
for the verifier which takes a learning algorithm as input, and extracts a choice of
&#x201C;prior&#x201D;. </td></tr></table></div><!--tex4ht:label?: x29-380011 -->
                                                                     

                                                                     
   </td></tr></table></div><hr class="endfigure" />
<!--l. 1395--><p class="indent">   Important early work developing approximately self-bounding learning algorithms
was also done by Domingos <span class="cite">[<a 
href="thesisli2.xml#XDomingos"><span 
class="ecbx-1000">12</span></a>]</span>.
</p>
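<p class="indent">   To make the description-language view of the simple microchoice bound concrete, the following is a minimal Python sketch, not the statement proved in section 5.2. It rests on illustrative assumptions that are not taken from the text above: each choice set depends only on the choices made earlier, the &#x201C;prior&#x201D; of a hypothesis is the product of 1/|C<sub>i</sub>| over the choice sets along the path that produced it, and the Occam&#x2019;s Razor bound is applied in its relaxed Hoeffding-style form. The function names <code>description_length_nats</code> and <code>microchoice_style_bound</code> are hypothetical.</p>
<pre class="verbatim">
```python
import math


def description_length_nats(choice_set_sizes):
    """Sum of ln|C_i| over the choice sets visited by the learner.

    Assumption: the prior of the output hypothesis is the product of
    1/|C_i|, so ln(1/prior) is exactly this sum.
    """
    return sum(math.log(size) for size in choice_set_sizes)


def microchoice_style_bound(train_error, num_examples, choice_set_sizes, delta):
    """Hoeffding-style Occam's Razor bound with a choice-derived prior.

    Assumption: this relaxed form is used for illustration only; a
    tighter form of the Occam's Razor bound may be preferable.
    Returns an upper bound on the true error rate that is intended to
    hold with probability at least 1 - delta over the sample.
    """
    complexity = description_length_nats(choice_set_sizes) + math.log(1.0 / delta)
    return train_error + math.sqrt(complexity / (2.0 * num_examples))


if __name__ == "__main__":
    # Hypothetical run: 20 hill-climbing steps with 8 options at each step.
    sizes = [8] * 20
    print(description_length_nats(sizes))
    print(microchoice_style_bound(0.10, 5000, sizes, 0.05))
```
</pre>
<p class="indent">   Under these assumptions, twenty steps with eight options each cost 20 ln 8 nats (roughly 41.6), however large the overall hypothesis space is. A single choice among all |H| hypotheses costs ln |H|, which recovers the familiar log of the hypothesis space size.</p>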
   <div class="sectionTOCS"><span class="sectionToc">&#x00A0;5.1.&#x00A0;&#x00A0;<a 
href="thesisse21.xml#x30-390005.1" name="QQ2-30-43">A Motivating
Observation</a></span><br /><span class="sectionToc">&#x00A0;5.2.&#x00A0;&#x00A0;<a 
href="thesisse22.xml#x32-400005.2" name="QQ2-32-44">The Simple Microchoice Bound</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.2.1.&#x00A0;&#x00A0;<a 
href="thesisse22.xml#x32-410005.2.1" name="QQ2-32-45">Examples</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.2.2.&#x00A0;&#x00A0;<a 
href="thesisse22.xml#x32-440005.2.2" name="QQ2-32-48">Pruning
</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.2.3.&#x00A0;&#x00A0;<a 
href="thesisse22.xml#x32-450005.2.3" name="QQ2-32-49">Microchoice and Structural Risk Minimization</a></span><br /><span class="sectionToc">&#x00A0;5.3.&#x00A0;&#x00A0;<a 
href="thesisse23.xml#x33-460005.3" name="QQ2-33-50">Combining Microchoice with
Freund&#x2019;s Query Tree approach</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.3.1.&#x00A0;&#x00A0;<a 
href="thesisse23.xml#x33-470005.3.1" name="QQ2-33-51">Preliminaries and Definitions</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.3.2.&#x00A0;&#x00A0;<a 
href="thesisse23.xml#x33-480005.3.2" name="QQ2-33-52">Background
and Summary</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.3.3.&#x00A0;&#x00A0;<a 
href="thesisse23.xml#x33-490005.3.3" name="QQ2-33-53">Microchoice Bounds for Query Trees</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.3.4.&#x00A0;&#x00A0;<a 
href="thesisse23.xml#x33-500005.3.4" name="QQ2-33-54">Allowing batch
queries</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.3.5.&#x00A0;&#x00A0;<a 
href="thesisse23.xml#x33-510005.3.5" name="QQ2-33-55">Example: Batch Queries for Decision trees</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.3.6.&#x00A0;&#x00A0;<a 
href="thesisse23.xml#x33-520005.3.6" name="QQ2-33-56">Adaptive Microchoice
vs.&#x00A0;Basic Microchoice</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.3.7.&#x00A0;&#x00A0;<a 
href="thesisse23.xml#x33-530005.3.7" name="QQ2-33-57">Other Adaptive Microchoice</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.3.8.&#x00A0;&#x00A0;<a 
href="thesisse23.xml#x33-540005.3.8" name="QQ2-33-58">Comparison with Freund&#x2019;s
Self-Bounding algorithms</a></span><br /><span class="subsectionToc">&#x00A0;&#x00A0;&#x00A0;5.3.9.&#x00A0;&#x00A0;<a 
href="thesisse23.xml#x33-550005.3.9" name="QQ2-33-59">Choice Set Conglomeration </a></span><br /><span class="sectionToc">&#x00A0;5.4.&#x00A0;&#x00A0;<a 
href="thesisse24.xml#x36-560005.4" name="QQ2-36-60">Microchoice discussion</a></span><br />
   </div>



                                                                     

                                                                     
   <div class="crosslinks"><p class="noindent">[<a 
href="thesisch6.xml" >next</a>] [<a 
href="thesisch5.xml" >front</a>] [<a 
href="thesispa2.xml#thesisch5.xml" >up</a>] </p></div><a 
  name="tailthesisch5.xml"></a>  
</body> 
</html> 
