Machine Learning, 10-701 and 15-781, 2005

Tom Mitchell and Andrew W. Moore
Center for Automated Learning and Discovery
School of Computer Science, Carnegie Mellon University

Fall 2005

It is hard to imagine anything more fascinating than systems that automatically improve their own performance through experience. Machine learning deals with computer algorithms for learning from many types of experience, ranging from robots exploring their environments, to mining pre-existing databases, to actively exploring and mining the web. This course is designed to give PhD students a thorough grounding in the methodologies, technologies, mathematics and algorithms needed to do research in learning and data mining, or to apply learning or data mining techniques to a target problem.

The topics of the course draw from classical statistics, from machine learning, from data mining, from Bayesian statistics and from statistical algorithmics.

Students entering the class with a pre-existing working knowledge of probability, statistics and algorithms will be at an advantage, but the class has been designed so that anyone with a strong numerate background can catch up and fully participate.

IF YOU ARE ON THE WAIT LIST: This class if now fully subscribed. You may want to consider the following options:

take the class when it is offered again in the spring semester
come to the first several lectures and see how the course develops. We will admit as many students from the waitlist as we can, once we see how many registered students drop the course during the first two weeks.

Class lectures: Tuesdays & Thursdays 10:30am-11:50am, Wean Hall 7500 starting on Tuesday September 13th, 2005

Review sessions: Thursdays 5-6pm, Location NSH 1305, starting on thursday September 15. TA's will cover material from lecture and the homeworks, and answer your questions. These review sessions are optional (but very helpful!).

Instructors:

Tom Mitchell, Wean Hall 5309, x8-2611, Office hours: by appointment through Sharon Cavlovich, sharon.cavlovich@cmu.edu
Andrew Moore, NSH 3117, x8-7599, Office hours: by appointment through Kristen Schrauder, kristens@cs.cmu.edu

Course secretary:

Sharon Cavlovich, sharon.cavlovich@cmu.edu , Wean Hall 5315, x8-5196

Teaching Assistants:

Jimeng Sun, Wean 5117, x8-3046, Office hours: Thursday 3:30-4:30PM
Ajit Singh, Wean 5130, x8-3052, Office hours: Tuesday 3:30-4:30PM
Mike Stilman, NSH 1612A, x8-7598, Office hours: Tuesday 1:00-2:00PM

Textbook:

Textbook (optional but strongly recommended): Machine Learning, Tom Mitchell.
Additional recommended (optional) textbook: Pattern Classification (2nd Edition), Duda, Hart and Stork.
Additional recommended (optional) textbook: Neural Networks for Pattern Recognition, Chris Bishop.
Additional readings will be made available as appropriate.

Grading:

Final grades will be based on midterm (20%), homework (30%), final project (20%), and final exam (30%).

Late homework policy:

You will be allowed 2 total late days without penalty for the entire semester. You may be late by 1 day on two different homeworks or late by 2 days on one homework. Once those days are used, you will be penalized according to the following policy:

Homework is worth full credit at the beginning of class on the due date.
It is worth half credit for the next 48 hours.
It is worth zero credit after that.

You must turn in all of the homeworks, even if for zero credit, in order to pass the course.
Turn in all late homework assignments to Sharon Cavlovich

Homework regrade policy:

If you feel that we have made an error in grading your homework, please turn in your homework with a written explanation to Sharon Cavlovich, and we will consider your request. Please note that regrading of a homework may cause your grade to go up or down.

Collaboration on Homeworks:

Homeworks will be done individually: each student must hand in their own answers. It is acceptable, however, for students to collaborate in figuring out answers and helping each other solve the problems. We assume that as PhD students you will be taking the responsibility to personally understand the solution to any work arising from such collaboration, and will write up your own final solution. You must write down on each homework the names of students with whom you collaborated.

Course project:

Project Guidance (out 10/11/05)
Datasets (out 10/11/05)

Exams:

There will be one midterm exam and a final exam. Both will be open book and open notes. Computers will not be allowed.
Final exam date: December 19 8:30-11:30a.m at HH B103 and HH B131 (Hammerschlag Hall) ( It will be impossible for us to accommodate individual requests to reschedule the final. Be sure you take this into account as you make travel plans for winter break.
Final review notes: the slides from Mike

Homeworks: Coming soon!

Tentative lecture schedule:

Module	Date	Lecture topic and readings	Lecturer	Homeworks
Optional warm-up	Thu Sep 8	Optional lecture: warm-up review of some basic probability concepts. Lecture: Basic Probability	Moore
Overview and a Machine Learning algorithm	Tu Sep 13	Machine Learning, Function Approximation, Decision Tree learning Reading: Machine Learning, Chapter 3, Decision Trees Lecture: Machine learning and Decision trees (pdf)	Mitchell
Review of probability, Maximum likelihood estimation, MAP estimation	Th Sep 15	Fast tour of useful concepts in probability Lecture: part1, part2, part3 Recitation 1: slides	Moore	HW1 pdf ps.gz Corrections Solutions
	Tu Sep 20	MLE and MAP estimation Lecture: slides	Moore
Linear models	Th Sep 22	Linear Regression and Basis Functions Lecture: slides	Moore
Naive Bayes	Tu Sep 27	Bayesian classifiers, Naive Bayes classifier, MLE and MAP estimates Lecture slides: Naive Bayes Required reading: Naive Bayes and Logistic Regression	Mitchell	HW1 due HW2 pdf train-1.txt test-1.txt plotGauss.m Solutions
Logistic regression Discriminative and Generative Models	Th Sep 29	Logistic regression, Generative and discriminative classifiers, maximizing conditional data likelihood, MLE and MAP estimates. Lecture slides: Logistic regression Required reading: Naive Bayes and Logistic Regression Optional reading: On Discriminative and Generative Classifiers, Ng and Jordan, NIPS, 2001.	Mitchell
Non-linear models Neural Networks	Tu Oct 4	Neural networks and gradient descent Lecture slides: neural networks Required reading: Machine Learning Chapter 4 Optional reading: Bishop chapter 9.1, 9.2	Mitchell
	Th Oct 6	Cross-validation and instance-based learning Lecture slides: overfitting instance-based Readings: Machine Learning Chapter 4. For a worked example of using cross-validation with gradient descent see the following paper (particularly the appendix): Memory-based learning, C. G. Atkeson, Memory-Based Approaches to Approximating Continuous Functions, Proceedings, Workshop on Nonlinear Modeling and Forecasting, Santa Fe, New Mexico, September 17-21, 1990 For more information about locally weighted methods see Locally Weighted Learning	Moore	HW2 due
Gaussian Mixture Models	Tu Oct 11	Cross-validation continued	Moore
	Th Oct 13	no lecture
Midterm Exam (solutions)	Tu Oct 18	Covers everything up to this date. Open book, notes. Closed computer. Come to class by 10.30am promptly. You will then have 80 minutes to answer six mostly-short questions on material covered in the lectures and readings up to and including October 11th. We strongly advise you to practice using previous exams, so you know what to expect. try doing the previous exams first, and then look at the solutions. You will be allowed to look at your notes in class, but don't rely on this because you will run out of time unless you are sufficiently familiar with the material that you can just do the questions without needing to look up the techniques. In addition, to help prepare, there will be a review at the recitation session at 5pm Thursday Oct 13th, and there will be another review on Monday Oct 17th, 6pm-7.30pm in NSH 1305. Previous examinations for practice.		Project proposals due
Computational learning theory	Th Oct 20	PAC Learning I: sample complexity, agnostic learning Reading: Machine Learning chapter 7 slides on PAC learning	Mitchell	HW3 ds2.txt Solution
	Tu Oct 25	PAC Learning II: VC dimension, SRM, Mistake bounds Reading: Machine Learning, chapter 7 slides on VCdimension and Mistake Bounds	Mitchell
Margin based approaches	Th Oct 27	SVMs, kernels, and optimization methods Reading: Burgess tutorial	Moore	RecitationHW3
Graphical Models	Tu Nov 1	Bayes nets: representation, conditional independence slides on Bayes Nets, slides with annotations Reading: Ghahramani tutorial (read section 2)	Mitchell	HW3 due
	Th Nov 3	Bayes nets: inference, variable elimination, etc. lecture notes1, notes2	Moore	Recitation
	Tu Nov 8	Bayes nets: learning parameters and structure (fully observed data, and begin EM)	Goldenberg
EM and semi-supervised learning	Th Nov 10	EM for Bayes networks and Mixtures of Gaussians slides	Mitchell
HMMs	Tu Nov 15	Hidden Markov Models: representation and learning lecture notes and annotated notes Reading (you should look at the first eight sides in detail)	Moore
Time series models	Th Nov 17	Graphical Models: an overview of more advanced probabistic models that fall under a category called Graphical Models. This lecture defines and talks about specifric instances, such as Kalman filters, undirected graphs and Dynamic Bayesian Networks	Goldenberg	Final project reports due
	Mon Nov 21	Project poster session: 4-6:30pm in the Newell-Simon Hall Atrium		Project poster session
Dimensionality reduction	Tu Nov 22	Dimensionality Reduction: Feature selection, PCA, SVD, ICA, Fisher discriminant slides	Mitchell
	Tu Nov 29	Advanced topic: Machine Learning and Text Analysis slides	Mitchell	HW4 missing.csv EM notes Inference notes Solutions
Markov models	Th Dec 1	Markov decision processes: Predicting the results of decisions in an uncertain world.	Moore
	Tu Dec 6	Reinforcement learning: Learning policies to maximize expected future rewards in an uncertain world. annotated slides reading: Machine Learning chapter 13	Moore
	Th Dec 8	Scaling: Some of Andrew's favorite data structures and algorithms for tractable statistical machine learning.	Moore	HW4 due
Final Exam	Monday Dec 19	December 19 8:30-11:30a.m at HH B103 and HH B131 (Hammerschlag Hall). No rescheduling possible. open book, open notes, closed computer.		HMM/MDP Review Dimension Reduction HMM

Web pages for earlier versions of this course: (include examples of midterms, homework questions, ...)

Course Website (this page):

http://www.cs.cmu.edu/~awm/10701
Corrections or comments to Jimeng Sun.

Note to people outside CMU: Please feel free to reuse any of these course materials that you find of use in your own courses. We ask that you retain any copyright notices, and include written notice indicating the source of any materials you use.

Machine Learning, 10-701 and 15-781, 2005

Fall 2005

Date

Lecture topic and readings

Tu Sep 13

Machine Learning, Function Approximation, Decision Tree learning

Tu Sep 20

Tu Sep 27