Vision and Mobile Robotics Laboratory | Publications
2006 |
Annotation: "Defects experienced during construction are costly and preventable. However, inspection programs employed today cannot adequately detect and manage defects that occur on construction sites, as they are based on measurements at specific locations and times, and are not integrated into complete electronic models. Emerging sensing technologies and project modeling capabilities motivate the development of a formalism that can be used for active quality control on construction sites. In this paper, we outline a process of acquiring and updating detailed design information, identifying inspection goals, inspection planning, as-built data acquisition and analysis, and defect detection and management. We discuss the validation of this formalism based on four case studies." . |
Annotation: "In this research we address the problem of classification and labeling of regions given a single static natural image. Natural images exhibit strong spatial dependencies, and modeling these dependencies in a principled manner is crucial to achieve good classification accuracy. In this work, we present Discriminative Random Fields (DRFs) to model spatial interactions in images in a discriminative framework based on the concept of Conditional Random Fields proposed by Lafferty et al. (2001). The DRFs classify image regions by incorporating neighborhood spatial interactions in the labels as well as the observed data. The DRF framework offers several advantages over the conventional Markov Random Field (MRF) framework. First, the DRFs allow relaxing the strong assumption of conditional independence of the observed data generally made in the MRF framework for tractability. This assumption is too restrictive for a large number of applications in computer vision. Second, the DRFs derive their classification power by exploiting probabilistic discriminative models instead of the generative models used for modeling observations in the MRF framework. Third, the interaction in labels in DRFs is based on the idea of pairwise discrimination of the observed data, making it data-adaptive instead of fixed a priori as in MRFs. Finally, all the parameters in the DRF model are estimated simultaneously from the training data, unlike the MRF framework where the likelihood parameters are usually learned separately from the field parameters. We present preliminary experiments with man-made structure detection and binary image restoration tasks, and compare the DRF results with the MRF results." . |
Annotation: "Image understanding requires not only individually estimating elements of the visual world but also capturing the interplay among them. In this paper, we provide a framework for placing local object detection in the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint. Most object detection methods consider all scales and locations in the image as equally likely. We show that with probabilistic estimates of 3D geometry, both in terms of surfaces and world coordinates, we can put objects into perspective and model the scale and location variance in the image. Our approach reflects the cyclical nature of the problem by allowing probabilistic object hypotheses to refine geometry and vice versa. Our framework allows painless substitution of almost any object detector and is easily extended to include other aspects of image understanding. Our results confirm the benefits of our integrated approach." . |
Annotation: "We present an efficient method for maximizing energy functions with first and second order potentials, suitable for MAP labeling estimation problems that arise in undirected graphical models. Our approach is to relax the integer constraints on the solution in two steps. First we efficiently obtain the relaxed global optimum following a procedure similar to the iterative power method for finding the largest eigenvector of a matrix. Next, we map the relaxed optimum on a simplex and show that the new energy obtained has a certain optimal bound. Starting from this energy we follow an efficient coordinate ascent procedure that is guaranteed to increase the energy at every step and converge to a solution that obeys the initial integral constraints. We also present a sufficient condition for ascent procedures that guarantees the increase in energy at every step. " . |
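The first stage described in this abstract, obtaining the relaxed global optimum of a quadratic energy with a procedure similar to the power method, can be sketched as follows. This is a minimal illustration under our own assumptions (the function name and example matrix are hypothetical); the simplex mapping and the discrete coordinate-ascent stages of the paper are not shown.

```python
import numpy as np

def relaxed_map(M, iters=200):
    """Relaxed maximization of x^T M x over unit-norm x via power
    iteration: repeatedly multiply by M and renormalize, converging
    to the leading eigenvector (hypothetical helper, not the paper's
    full algorithm)."""
    x = np.full(M.shape[0], 1.0 / np.sqrt(M.shape[0]))
    for _ in range(iters):
        x = M @ x
        x /= np.linalg.norm(x)
    return x

# Symmetric non-negative 'potential' matrix; the relaxed optimum is
# its leading eigenvector [1, 1] / sqrt(2).
M = np.array([[2.0, 1.0], [1.0, 2.0]])
x = relaxed_map(M)
print(x)   # ≈ [0.707, 0.707]
```

After this relaxed solution is found, the paper maps it onto a simplex and runs a coordinate ascent that restores the integer label constraints.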
Annotation: "We introduce a method for object class detection and localization which combines regions generated by image segmentation with local patches. Region-based descriptors can model and match regular textures reliably, but fail on parts of the object which are textureless. They also cannot repeatably identify interest points on their boundaries. By incorporating information from patch-based descriptors near the regions into a new feature, the Region-based Context Feature (RCF), we can address these issues. We apply Region-based Context Features in a semi-supervised learning framework for object detection and localization. This framework produces object-background segmentation masks of deformable objects. Numerical results are presented for pixel-level performance. " . |
Annotation: "Occlusion boundaries are notoriously difficult for many patch-based computer vision algorithms, but they also provide potentially useful information about scene structure and shape. Using short video clips, we present a novel method for scoring the degree to which edges exhibit occlusion. We first utilize a spatio-temporal edge detector which estimates edge strength, orientation, and normal motion. By then extracting patches from either side of each detected (possibly moving) edgelet, we can estimate and compare motion to determine if occlusion is present. This completely local, bottom-up approach is intended to provide powerful low-level information for use by higher-level reasoning methods." . |
Annotation: "We describe an extension to ordinary patch-based edge detection in images using spatio-temporal volumetric patches from video. The inclusion of temporal information enables us to estimate motion normal to edges in addition to edge strength and spatial orientation. The method can handle complex edges in clutter by comparing distributions of data on either half of an extracted patch, rather than modeling the intensity profile of the edge. An efficient approach is provided for building the necessary histograms which samples candidate edge orientations and motions. Results are compared to classical spatio-temporal filtering techniques. " . |
Annotation: "For real-time stereo vision systems, the standard method for estimating sub-pixel stereo disparity given an initial integer disparity map involves fitting parabolas to a matching cost function aggregated over rectangular windows. This results in a phenomenon known as pixel-locking, which produces artificially peaked histograms of sub-pixel disparity. These peaks correspond to the introduction of erroneous ripples or waves in the 3D reconstruction of truly flat surfaces. Since stereo vision is a common input modality for autonomous vehicles, these inaccuracies can pose a problem for safe, reliable navigation. This paper proposes a new method for sub-pixel stereo disparity estimation, based on ideas from Lucas-Kanade tracking and optical flow, which substantially reduces the pixel-locking effect. In addition, it has the ability to correct much larger initial disparity errors than previous approaches and is more general as it applies not only to the ground plane. We demonstrate the method on synthetic imagery as well as real stereo data from an autonomous outdoor vehicle." . |
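The standard parabola-fitting baseline that this abstract critiques can be sketched as follows. This is a hedged illustration (the function name and cost curve are our own); the paper's Lucas-Kanade-based refinement, which avoids the pixel-locking artifact, is not shown.

```python
import numpy as np

def subpixel_parabola(cost, d):
    """Refine an integer disparity d by fitting a parabola through the
    matching costs at d-1, d, d+1 and returning the parabola's minimum
    (the standard baseline; hypothetical helper, not the paper's method)."""
    c0, c1, c2 = cost[d - 1], cost[d], cost[d + 1]
    denom = c0 - 2.0 * c1 + c2
    if denom == 0:            # flat cost curve: no refinement possible
        return float(d)
    return d + 0.5 * (c0 - c2) / denom

# Example: a quadratic cost curve whose true minimum lies at 5.3
costs = np.array([(x - 5.3) ** 2 for x in range(10)])
print(subpixel_parabola(costs, 5))   # 5.3
```

Because real cost curves are not exactly parabolic, this estimator biases sub-pixel results toward integer disparities, which is the pixel-locking effect the paper addresses.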
Annotation: "Despite the fact that color is a powerful cue in object recognition, the extraction of scale-invariant interest regions from color images frequently begins with a conversion of the image to grayscale. The isolation of interest points is then completely determined by luminance, and the use of color is deferred to the stage of descriptor formation. This seemingly innocuous conversion to grayscale is known to suppress saliency and can lead to representative regions being undetected by procedures based only on luminance. Furthermore, grayscaled images of the same scene under even slightly different illuminants can appear sufficiently different as to affect the repeatability of detections across images. We propose a method that combines information from the color channels to drive the detection of scale-invariant keypoints. By factoring out the local effect of the illuminant using an expressive linear model, we demonstrate robustness to a change in the illuminant without having to estimate its properties from the image. Results are shown on challenging images from two commonly used color constancy datasets. " . |
Annotation: "An important task in the analysis and reconstruction of curvilinear structures from unorganized 3-D point samples is the estimation of tangent information at each data point. Its main challenges are in (1) the selection of an appropriate scale of analysis to accommodate noise, density variation and sparsity in the data, and in (2) the formulation of a model and associated objective function that correctly expresses their effects. We pose this problem as one of estimating the neighborhood size for which the principal eigenvector of the data scatter matrix is best aligned with the true tangent of the curve, in a probabilistic sense. We analyze the perturbation on the direction of the eigenvector due to finite samples and noise using the expected statistics of the scatter matrix estimators, and employ a simple iterative procedure to choose the optimal neighborhood size. Experiments on synthetic and real data validate the behavior predicted by the model, and show competitive performance and improved stability over leading polynomial-fitting alternatives that require a preset scale. " . |
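The core estimator this abstract builds on, the principal eigenvector of the local scatter matrix, can be sketched as follows. This is a minimal sketch under our own naming; the paper's contribution, selecting the neighborhood size automatically from perturbation statistics, is omitted and a fixed radius is assumed instead.

```python
import numpy as np

def tangent_estimate(points, center, radius):
    """Estimate the curve tangent at `center` as the principal
    eigenvector of the scatter matrix of the points within `radius`
    (hypothetical helper with a preset scale, which is exactly what
    the paper's automatic scale selection avoids)."""
    nbrs = points[np.linalg.norm(points - center, axis=1) <= radius]
    scatter = np.cov(nbrs.T)
    vals, vecs = np.linalg.eigh(scatter)
    return vecs[:, np.argmax(vals)]   # eigenvector of largest eigenvalue

# Noisy samples along the x-axis in 3-D
rng = np.random.default_rng(0)
pts = np.column_stack([np.linspace(-1, 1, 200),
                       rng.normal(0, 0.01, 200),
                       rng.normal(0, 0.01, 200)])
t = tangent_estimate(pts, np.zeros(3), 0.5)
print(np.abs(t[0]))   # close to 1: tangent aligned with the x-axis
```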
2005 |
Annotation: "Three-dimensional ladar data are commonly used to perform scene understanding for outdoor mobile robots, specifically in natural terrain. One effective method is to classify points using features based on local point cloud distribution into surfaces, linear structures or clutter volumes. But the local features are computed using 3-D points within a support-volume. Local and global point density variations and the presence of multiple manifolds make the problem of selecting the size of this support volume, or scale, challenging. In this paper we adopt an approach inspired by recent developments in computational geometry and investigate the problem of automatic data-driven scale selection to improve point cloud classification. The approach is validated with results using data from different sensors in various environments classified into different terrain types (vegetation, solid surface and linear structure). " . |
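The "features based on local point cloud distribution" mentioned above are commonly realized from the eigenvalues of a local 3-D scatter matrix. The sketch below shows that feature stage only, under our own naming; the paper's data-driven scale selection and the actual classifier are not shown.

```python
import numpy as np

def saliency_features(neighbors):
    """Compute classic point-distribution saliencies from the eigenvalues
    of the local scatter matrix: 'linear' is large when one direction
    dominates (e.g. branches/wires), 'surface' when two do (e.g. ground),
    'scatter' when all three are comparable (e.g. foliage)."""
    l3, l2, l1 = np.sort(np.linalg.eigvalsh(np.cov(neighbors.T)))  # l1 >= l2 >= l3
    return {"linear": l1 - l2, "surface": l2 - l3, "scatter": l3}

# A flat patch in the x-y plane should score highest on 'surface'
rng = np.random.default_rng(1)
patch = np.column_stack([rng.uniform(-1, 1, 500),
                         rng.uniform(-1, 1, 500),
                         rng.normal(0, 0.005, 500)])
f = saliency_features(patch)
print(max(f, key=f.get))   # surface
```

The difficulty the paper targets is visible here: these eigenvalues depend entirely on which points fall inside the support volume, so a poorly chosen scale can make foliage look planar or a thin branch look like clutter.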
Annotation: "Autonomous navigation in natural environments requires three-dimensional (3-D) scene representation and interpretation. High-density laser-based sensing is commonly used to capture the geometry of the scene, producing large amounts of 3-D points with variable spatial density. We propose a terrain classification method using such data. The approach relies on the computation of local features in 3-D using a support volume and belongs, as such, to a larger class of computational problems where range searches are necessary. This operation on traditional data structures is very expensive and, in this paper, we present an approach to address this issue. The method relies on reusing already computed data as the terrain classification process progresses over the environment representation. We present results that show significant speed improvement using ladar data collected in various environments with a ground mobile robot." . |
Annotation: "Current feature-based object recognition methods use information derived from local image patches. For robustness, features are engineered for invariance to various transformations, such as rotation, scaling, or affine warping. When patches overlap object boundaries, however, errors in both detection and matching will almost certainly occur due to inclusion of unwanted background pixels. This is common in real images, which often contain significant background clutter, objects which are not heavily textured, or objects which occupy a relatively small portion of the image. We suggest improvements to the popular Scale Invariant Feature Transform (SIFT) which incorporate local object boundary information. The resulting feature detection and descriptor creation processes are invariant to changes in background. We call this method the Background and Scale Invariant Feature Transform (BSIFT). We demonstrate BSIFT's superior performance in feature detection and matching on synthetic and natural images." . |
Annotation: "Errors in laser-based range measurements can be divided into two categories: intrinsic sensor errors (range drift with temperature, systematic and random errors), and errors due to the interaction of the laser beam with the environment. The former have traditionally received attention and can be modeled. The latter, in contrast, have long been observed but not well characterized, and we propose to do so in this paper. In addition, we present a sensor-independent method to remove such artifacts. The objective is to improve the overall quality of 3-D scene reconstruction in order to perform terrain classification of scenes with vegetation." . |
Annotation: "Despite significant advances in image segmentation techniques, evaluation of these techniques thus far has been largely subjective. Typically, the effectiveness of a new algorithm is demonstrated only by the presentation of a few segmented images and is otherwise left to subjective evaluation by the reader. Little effort has been spent on the design of perceptually correct measures to compare an automatic segmentation of an image to a set of hand-segmented examples of the same image. This paper demonstrates how a modification of the Rand index, the Normalized Probabilistic Rand (NPR) index, meets the requirements of large-scale performance evaluation of image segmentation. We show that the measure has a clear probabilistic interpretation as the maximum likelihood estimator of an underlying Gibbs model, can be correctly normalized to account for the inherent similarity in a set of ground truth images, and can be computed efficiently for large datasets. Results are presented on images from the publicly available Berkeley Segmentation dataset." . |
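The underlying measure that the NPR index modifies, the plain Rand index, can be sketched as follows. This shows only the pairwise-agreement idea (names are our own, and the brute-force pair loop is for clarity); the paper's probabilistic normalization over a set of ground truths, and its efficient computation, are not shown.

```python
import numpy as np
from itertools import combinations

def rand_index(a, b):
    """Plain Rand index between two labelings: the fraction of pixel
    pairs on which the segmentations agree about 'same segment' vs.
    'different segment'. Invariant to how segments are named."""
    a, b = np.ravel(a), np.ravel(b)
    agree = sum((a[i] == a[j]) == (b[i] == b[j])
                for i, j in combinations(range(len(a)), 2))
    return agree / (len(a) * (len(a) - 1) / 2)

seg1 = [0, 0, 1, 1]
seg2 = [1, 1, 0, 0]   # same partition, different label names
print(rand_index(seg1, seg2))   # 1.0
```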
Annotation: "In this paper, we explore the problem of three-dimensional motion planning in highly cluttered and unstructured outdoor environments. Because accurate sensing and modeling of obstacles is notoriously difficult in such environments, we aim to build computational tools that can handle large point data sets (e.g. LADAR data). Using a priori aerial data scans of forested environments, we compute a network of free space bubbles forming safe paths within environments cluttered with tree trunks, branches and dense foliage. The network (roadmap) of paths is used for efficiently planning paths that consider obstacle clearance information. We present experimental results on large point data sets typical of those faced by Unmanned Aerial Vehicles, but also applicable to ground-based robots navigating through forested environments." . |
Annotation: "In this work, we address the detection of vehicles in a video stream obtained from a moving airborne platform. We propose a Bayesian framework for estimating dense optical flow over time that explicitly estimates a persistent model of background appearance. The approach assumes that the scene can be described by background and occlusion layers, estimated within an Expectation-Maximization framework. The mathematical formulation of the paper is an extension of our previous work where motion and appearance models for foreground and background layers are estimated simultaneously in a Bayesian framework" . |
Annotation: "AVI Video available at: http://www.cs.cmu.edu/~dhoiem/projects/popup/popup_movie_912_500_DivX.avi". |
Annotation: "Classification of various image components (pixels, regions and objects) into meaningful categories is a challenging task due to ambiguities inherent to visual data. Natural images exhibit strong contextual dependencies in the form of spatial interactions among components. For example, neighboring pixels tend to have similar class labels, and different parts of an object are related through geometric constraints. Going beyond these, different regions (e.g., sky and water) or objects (e.g., monitor and keyboard) appear in restricted spatial configurations. Modeling these interactions is crucial to achieve good classification accuracy. In this thesis, we present discriminative field models that capture spatial interactions in images in a discriminative framework based on the concept of Conditional Random Fields proposed by Lafferty et al. The discriminative fields offer several advantages over the Markov Random Fields (MRFs) popularly used in computer vision. First, they allow capturing arbitrary dependencies in the observed data by relaxing the restrictive assumption of conditional independence generally made in MRFs for tractability. Second, the interaction in labels in discriminative fields is based on the observed data, instead of being fixed a priori as in MRFs. This is critical to incorporate different types of context in images within a single framework. Finally, the discriminative fields derive their classification power by exploiting probabilistic discriminative models instead of the generative models used in MRFs. Since the graphs induced by the discriminative fields may have arbitrary topology, exact maximum likelihood parameter learning may not be feasible. We present an approach which approximates the gradients of the likelihood with simple piecewise constant functions constructed using inference techniques. To exploit different levels of contextual information in images, a two-layer hierarchical formulation is also described. It encodes both short-range interactions (e.g., pixelwise label smoothing) and long-range interactions (e.g., relative configurations of objects or regions) in a tractable manner. The models proposed in this thesis are general enough to be applied seamlessly within a single framework to several challenging computer vision tasks such as contextual object detection, semantic scene segmentation, texture recognition, and image denoising." . |
2004 |
Annotation: "We present a multi-projector stereoscopic display which incorporates a high-resolution inset image, or fovea. The system uses four projectors, and the image warping required for on-screen image alignment and foveation is applied as part of the rendering pass. We discuss the problem of ambiguous depth perception between the boundaries of the inset in each eye and the underlying scene, and present a solution where the inset boundaries are dynamically adapted as a function of the scene geometry. An efficient real-time method for boundary adaptation is introduced. It is applied as a post-rendering step, does not require direct geometric computations on the scene, and is therefore practically independent of the size and complexity of the model." . |
Annotation: "We present a stereoscopic display system which incorporates a high-resolution inset image, or fovea. We describe the specific problem of false depth cues along the boundaries of the inset image, and propose a solution in which the boundaries of the inset image are dynamically adapted as a function of the geometry of the scene. This method produces comfortable stereoscopic viewing at a low additional computational cost. The four projectors need only be approximately aligned: a single drawing pass is required, regardless of projector alignment, since the warping is applied as part of the 3-D rendering process." . |
2003 |
Annotation: "This paper proposes a method to fit a skeleton, or stick-model, to a blob to determine the pose of a person in an image. The input is a binary image representing the silhouette of a person and the output is a stick-model coherent with the pose of the person in this image. A torso model is first defined, and is then scaled and fitted to the blob using the distance transform of the original image. Then, the fitting is performed independently for each of the four limbs (two arms, two legs), again using the distance transform. The fact that each limb is fitted independently speeds up the fitting process, avoiding the combinatorial complexity problems that are frequent with this type of method." URL: http://vision.gel.ulaval.ca/fr/publications/Id_444/PublDetails.php Keywords: pose recognition |
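The distance transform that the fitting above relies on assigns each silhouette pixel its distance to the nearest background pixel, so the torso axis tends to follow the ridge of large values. A minimal brute-force sketch (names are our own; a real system would use a linear-time algorithm):

```python
import numpy as np

def distance_transform(mask):
    """Brute-force Euclidean distance transform of a binary silhouette:
    each foreground pixel gets its distance to the nearest background
    pixel. O(n^2) and for illustration only."""
    fg = np.argwhere(mask)
    bg = np.argwhere(~mask)
    dt = np.zeros(mask.shape)
    for y, x in fg:
        dt[y, x] = np.min(np.hypot(bg[:, 0] - y, bg[:, 1] - x))
    return dt

# A 5x5 silhouette: the center pixel is deepest inside the blob,
# so a medial 'stick' fitted to the maxima would pass through it.
sil = np.zeros((5, 5), dtype=bool)
sil[1:4, 1:4] = True
dt = distance_transform(sil)
print(dt)
```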
2002 |
2001 |
2000 |
1999 |
1998 |
1997 |