Vision and Mobile Robotics Laboratory | Publications


Publications of year 2003
Theses
  1. Owen Carmichael. Discriminative Techniques For The Recognition Of Complex-Shaped Objects. PhD thesis, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, September 2003. (url) (pdf)
    Keywords: object recognition, computer vision.
    Abstract: "This thesis presents new techniques which enable the automatic recognition of everyday objects like chairs and ladders in images of highly cluttered scenes. Given an image, we extract information about the shape and texture properties present in small patches of the image and use that information to identify parts of the objects we are interested in. We then assemble those parts into overall hypotheses about what objects are present in the image, and where they are. Solving this problem in a general setting is one of the central problems in computer vision, as doing so would have an immediate impact on a far-reaching set of applications in medicine, surveillance, manufacturing, robotics, and other areas. The central theme of this work is that formulating object recognition as a discrimination problem can ease the burden of system design. In particular, we show that thinking of recognition in terms of discriminating between objects and clutter, rather than separately modeling the appearances of objects and clutter, can simplify the processes of extracting information from the image and identifying which parts of the image correspond with parts of objects. The bulk of this thesis is concerned with recognizing "wiry" objects in highly-cluttered images; an example problem is finding ladders in images of a messy warehouse space. Wiry objects are distinguished by a prevalence of very thin, elongated, stick-like components; examples include tables, chairs, bicycles, and desk lamps. They are difficult to recognize because they tend to lack distinctive color or texture characteristics and their appearance is not easy to describe succinctly in terms of rectangular patches of image pixels. Here, we present a set of algorithms which extends current capabilities to find wiry objects in highly cluttered images across changes in the clutter and object pose. Specifically, we present discrimination-centered techniques for extracting shape features from portions of images, classifying those features as belonging to an object of interest or not, and aggregating found object parts together into overall instances of objects. Moreover, we present a suite of experiments on real, wiry objects­ a chair, cart, ladder, and stool respectively ­ which substantiates the utility of these methods and explores their behavior. The second part of the thesis presents a technique for extracting texture features from images in such a way that features from objects of interest are both well-clustered with each other and well-separated from the features from clutter. We present an optimization framework for automatically combining existing texture features into features that discriminate well, thus simplifying the process of tuning the parameters of the feature extraction process. This approach is substantiated in recognition experiments on real objects in real, cluttered images."
    @phdthesis{Carmichael_2003_4628,
    author = "Owen Carmichael",
    title = "Discriminative Techniques For The Recognition Of Complex-Shaped Objects",
    school = "Robotics Institute, Carnegie Mellon University",
    month = "September",
    year = "2003",
    address = "Pittsburgh, PA",
    url="http://www.ri.cmu.edu/pubs/pub_4628.html",
    pdf="http://www.ri.cmu.edu/pub_files/pub4/carmichael_owen_2003_2/carmichael_owen_2003_2.pdf",
    abstract="This thesis presents new techniques which enable the automatic recognition of everyday objects like chairs and ladders in images of highly cluttered scenes. Given an image, we extract information about the shape and texture properties present in small patches of the image and use that information to identify parts of the objects we are interested in. We then assemble those parts into overall hypotheses about what objects are present in the image, and where they are. Solving this problem in a general setting is one of the central problems in computer vision, as doing so would have an immediate impact on a far-reaching set of applications in medicine, surveillance, manufacturing, robotics, and other areas. The central theme of this work is that formulating object recognition as a discrimination problem can ease the burden of system design. In particular, we show that thinking of recognition in terms of discriminating between objects and clutter, rather than separately modeling the appearances of objects and clutter, can simplify the processes of extracting information from the image and identifying which parts of the image correspond with parts of objects. The bulk of this thesis is concerned with recognizing "wiry" objects in highly-cluttered images; an example problem is finding ladders in images of a messy warehouse space. Wiry objects are distinguished by a prevalence of very thin, elongated, stick-like components; examples include tables, chairs, bicycles, and desk lamps. They are difficult to recognize because they tend to lack distinctive color or texture characteristics and their appearance is not easy to describe succinctly in terms of rectangular patches of image pixels. Here, we present a set of algorithms which extends current capabilities to find wiry objects in highly cluttered images across changes in the clutter and object pose. Specifically, we present discrimination-centered techniques for extracting shape features from portions of images, classifying those features as belonging to an object of interest or not, and aggregating found object parts together into overall instances of objects. Moreover, we present a suite of experiments on real, wiry objects­ a chair, cart, ladder, and stool respectively ­ which substantiates the utility of these methods and explores their behavior. The second part of the thesis presents a technique for extracting texture features from images in such a way that features from objects of interest are both well-clustered with each other and well-separated from the features from clutter. We present an optimization framework for automatically combining existing texture features into features that discriminate well, thus simplifying the process of tuning the parameters of the feature extraction process. This approach is substantiated in recognition experiments on real objects in real, cluttered images.",
    keywords="object recognition, computer vision" 
    }
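
    Illustrative sketch (not from the thesis): the abstract describes an optimization framework that combines existing texture features into features that are well-clustered for the object and well-separated from clutter. The thesis's actual framework is not reproduced here; as a stand-in with the same goal, a two-class Fisher-discriminant combination of hypothetical elementary texture responses could look like this:

        # Stand-in only: a Fisher-discriminant-style linear combination of
        # elementary texture features, not the thesis's actual optimization.
        import numpy as np

        def combine_texture_features(obj_feats, clutter_feats):
            """obj_feats, clutter_feats: (n_samples, n_features) arrays of
            elementary texture responses (hypothetical inputs). Returns a
            weight vector defining one combined, discriminative feature."""
            mu_o = obj_feats.mean(axis=0)
            mu_c = clutter_feats.mean(axis=0)
            # Within-class scatter, regularized so the solve is well posed.
            Sw = np.cov(obj_feats, rowvar=False) + np.cov(clutter_feats, rowvar=False)
            Sw += 1e-6 * np.eye(Sw.shape[0])
            w = np.linalg.solve(Sw, mu_o - mu_c)
            return w / np.linalg.norm(w)

    Projecting a patch's elementary features onto the returned direction yields a single feature that tends to cluster tightly on the object and separate from clutter, which is the behavior the thesis optimizes for.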

  2. Peng Chang. Robust Tracking and Structure from Motion with Sampling Method. PhD thesis, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, February 2003.
    @phdthesis{Chang_2003_4296,
    author = "Peng Chang",
    title = "Robust Tracking and Structure from Motion with Sampling Method",
    school = "Robotics Institute, Carnegie Mellon University",
    month = "February",
    year = "2003",
    address = "Pittsburgh, PA" 
    }

Journal articles or book chapters
  1. Daniel Huber and Martial Hebert. Fully automatic registration of multiple 3-D data sets. Image and Vision Computing, 21(7):637-650, July 2003.
    Keywords: geometric modeling, registration, surface matching, automatic modeling.
    @article{Huber_2003_4470,
    author = "Daniel Huber and Martial Hebert",
    title = "Fully automatic registration of multiple 3-D data sets",
    journal = "Image and Vision Computing",
    month = "July",
    year = "2003",
    volume = "21",
    number = "7",
    pages = "637-650",
    keywords="geometric modeling, registration, surface matching, automatic modeling" 
    }

  2. Sanjiv Kumar, Alex C. Loui, and Martial Hebert. An Observation-Constrained Generative Approach for Probabilistic Classification of Image Regions. Image and Vision Computing, 21:87-97, 2003. (pdf)
    @article{Kumar_2003_4594,
    author = "Sanjiv Kumar and Alex C. Loui and Martial Hebert",
    title = "An Observation-Constrained Generative Approach for Probabilistic Classification of Image Regions",
    journal = "Image and Vision Computing",
    year = "2003",
    volume = "21",
    pages = "87-97",
    pdf ="http://www.ri.cmu.edu/pub_files/pub4/kumar_sanjiv_2003_1/kumar_sanjiv_2003_1.pdf" 
    }

  3. Shyjan Mahamud, Lance R. Williams, Karvel K. Thornber, and Kanglin Xu. Segmentation of Multiple Salient Closed Contours from Real Images. IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(4), April 2003. (url) (pdf)
    Abstract: "Using a saliency measure based on the global property of contour closure, we have developed a segmentation method which identifies smooth closed contours bounding objects of unknown shape in real images. The saliency measure incorporates the Gestalt principles of proximity and good continuity that previous methods have also exploited. Unlike previous methods, we incorporate contour closure by finding the eigenvector with the largest positive real eigenvalue of a transition matrix for a Markov process where edges from the image serve as states. Element (i, j) of the transition matrix is the conditional probability that a contour which contains edge j will also contain edge i. In this paper, we show how the saliency measure, defined for individual edges, can be used to derive a saliency relation, defined for pairs of edges, and further show that strongly-connected components of the graph representing the saliency relation correspond to smooth closed contours in the image. Finally, we report for the first time, results on large real images for which segmentation takes an average of about 10 seconds per object on a general-purpose workstation."
    @article{Mahamud_2003_4707,
    author = "Shyjan Mahamud and Lance R. Williams and Karvel K. Thornber and Kanglin Xu",
    title = "Segmentation of Multiple Salient Closed Contours from Real Images",
    journal = "IEEE Trans. on Pattern Analysis and Machine Intelligence",
    month = "April",
    year = "2003",
    volume = "25",
    number = "4",
    url = "http://www.ri.cmu.edu/pubs/pub_4707.html",
    pdf ="http://www.ri.cmu.edu/pub_files/pub4/mahamud_shyjan_2003_2/mahamud_shyjan_2003_2.pdf",
    abstract="Using a saliency measure based on the global property of contour closure, we have developed a segmentation method which identifies smooth closed contours bounding objects of unknown shape in real images. The saliency measure incorporates the Gestalt principles of proximity and good continuity that previous methods have also exploited. Unlike previous methods, we incorporate contour closure by finding the eigenvector with the largest positive real eigenvalue of a transition matrix for a Markov process where edges from the image serve as states. Element (i, j) of the transition matrix is the conditional probability that a contour which contains edge j will also contain edge i. In this paper, we show how the saliency measure, defined for individual edges, can be used to derive a saliency relation, defined for pairs of edges, and further show that strongly-connected components of the graph representing the saliency relation correspond to smooth closed contours in the image. Finally, we report for the first time, results on large real images for which segmentation takes an average of about 10 seconds per object on a general-purpose workstation." 
    }
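
    Illustrative sketch (not from the paper): the abstract defines edge saliency through the eigenvector with the largest positive real eigenvalue of a transition matrix whose states are image edges. Assuming a nonnegative matrix P of that form has already been built from the proximity and good-continuation cues (its construction is omitted, as is the grouping of strongly-connected components into closed contours), that eigenvector can be approximated by power iteration:

        import numpy as np

        def edge_saliency(P, n_iter=200, tol=1e-9):
            """P: (n_edges, n_edges) nonnegative matrix; P[i, j] is the
            probability that a contour containing edge j also contains
            edge i. Returns the dominant eigenvector, whose entries are
            used as per-edge saliencies."""
            v = np.full(P.shape[0], 1.0 / P.shape[0])
            for _ in range(n_iter):
                v_next = P @ v
                v_next /= np.linalg.norm(v_next)
                if np.linalg.norm(v_next - v) < tol:
                    break
                v = v_next
            return v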

Conference articles
  1. Owen Carmichael and Martial Hebert. Shape-based Recognition Of Wiry Objects. In IEEE Conference On Computer Vision And Pattern Recognition, June 2003. IEEE Press. (url) (pdf)
    Keywords: object recognition, computer vision.
    Abstract: "We present an approach to the recognition of complex-shaped objects in cluttered environments based on edge cues. We first use example images of the desired object in typical backgrounds to train a classifier cascade which determines whether edge pixels in an image belong to an instance of the object or the clutter. Presented with a novel image, we use the cascade to discard clutter edge pixels. The features used for this classification are localized, sparse edge density operations. Experiments validate the effectiveness of the technique for recognition of complex objects in cluttered indoor scenes under arbitrary out-of-image-plane rotation."
    @inproceedings{Carmichael_2003_4386,
    author = "Owen Carmichael and Martial Hebert",
    title = "Shape-based Recognition Of Wiry Objects",
    booktitle = "IEEE Conference On Computer Vision And Pattern Recognition",
    month = "June",
    year = "2003",
    publisher = "IEEE Press",
    pdf = "http://www.ri.cmu.edu/pub_files/pub4/carmichael_owen_2003_1/carmichael_owen_2003_1.pdf",
    url="http://www.ri.cmu.edu/pubs/pub_4386.html",
    abstract="We present an approach to the recognition of complex-shaped objects in cluttered environments based on edge cues. We first use example images of the desired object in typical backgrounds to train a classifier cascade which determines whether edge pixels in an image belong to an instance of the object or the clutter. Presented with a novel image, we use the cascade to discard clutter edge pixels. The features used for this classification are localized, sparse edge density operations. Experiments validate the effectiveness of the technique for recognition of complex objects in cluttered indoor scenes under arbitrary out-of-image-plane rotation.",
    keywords="object recognition, computer vision" 
    }
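
    Illustrative sketch (not from the paper): the abstract describes a trained cascade that labels edge pixels as object or clutter using localized, sparse edge-density features, discarding clutter pixels stage by stage. The form of each stage's classifier is not given in the abstract; assuming simple linear stages and precomputed features, the rejection logic could look like this:

        import numpy as np

        def classify_edge_pixels(edge_density_feats, stages):
            """edge_density_feats: (n_pixels, n_features) edge-density responses
            around each edge pixel (feature extraction not shown).
            stages: list of (weights, bias) pairs, one assumed-linear test per
            cascade stage, trained beforehand on object vs. clutter examples.
            Returns a boolean mask of edge pixels kept as likely object pixels;
            a pixel is discarded as clutter as soon as any stage rejects it."""
            keep = np.ones(edge_density_feats.shape[0], dtype=bool)
            for w, b in stages:
                alive = np.flatnonzero(keep)
                scores = edge_density_feats[alive] @ w + b
                keep[alive[scores < 0]] = False  # rejected by this stage
            return keep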

  2. Cristian Dima, Nicolas Vandapel, and Martial Hebert. Sensor and Classifier Fusion for Outdoor Obstacle Detection: an Application of Data Fusion To Autonomous Off-Road Navigation. In The 32nd Applied Imagery Pattern Recognition Workshop (AIPR 2003), October 2003. IEEE Computer Society. (pdf)
    @inproceedings{Dima_2003_4582,
    author = "Cristian Dima and Nicolas Vandapel and Martial Hebert",
    title = "Sensor and Classifier Fusion for Outdoor Obstacle Detection: an Application of Data Fusion To Autonomous Off-Road Navigation",
    booktitle = "The 32nd Applied Imagery Recognition Workshop (AIPR2003)",
    month = "October",
    year = "2003",
    publisher = "IEEE Computer Society",
    pdf ="http://www.ri.cmu.edu/pub_files/pub4/dima_cristian_2003_1/dima_cristian_2003_1.pdf" 
    }

  3. C. Gordon, F. Boukamp, Daniel Huber, Edward Latimer, K. Park, and B. Akinci. Combining Reality Capture Technologies for Construction Defect Detection: A Case Study. In EIA9: E-Activities and Intelligent Support in Design and the Built Environment, 9th EuropIA International Conference, pages 99-108, October 2003.
    Keywords: 3-D perception, geometric modeling, object recognition, defect detection, construction site modeling.
    @inproceedings{Gordon_2003_4670,
    author = "C. Gordon and F. Boukamp and Daniel Huber and Edward Latimer and K. Park and B. Akinci",
    title = "Combining Reality Capture Technologies for Construction Defect Detection: A Case Study",
    booktitle = "EIA9: E-Activities and Intelligent Support in Design and the Built Environment, 9th EuropIA International Conference",
    month = "October",
    year = "2003",
    pages = "99-108",
    keywords="3-D perception, geometric modeling, object recognition, defect detection, construction site modeling" 
    }

  4. Daniel Huber and Martial Hebert. 3-D Modeling Using a Statistical Sensor Model and Stochastic Search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 858-865, June 2003. (pdf)
    Keywords: 3-D perception, geometric modeling, 3-D modeling, registration, surface matching, automatic modeling.
    @inproceedings{Huber_2003_4427,
    author = "Daniel Huber and Martial Hebert",
    title = "3-D Modeling Using a Statistical Sensor Model and Stochastic Search",
    booktitle = "Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)",
    month = "June",
    year = "2003",
    pages = "858-865",
    pdf ="http://www.ri.cmu.edu/pub_files/pub4/huber_daniel_2003_2/huber_daniel_2003_2.pdf",
    keywords="3-D perception, geometric modeling, 3-D modeling, registration, surface matching, automatic modeling" 
    }

  5. Daniel Huber and Nicolas Vandapel. Automatic 3-D underground mine mapping. In International Conference on Field and Service Robotics, July 2003.
    Keywords: 3-D perception, geometric modeling, 3-D modeling, registration, mine mapping.
    @inproceedings{Huber_2003_4414,
    author = "Daniel Huber and Nicolas Vandapel",
    title = "Automatic 3-D underground mine mapping",
    booktitle = "International Conference on Field and Service Robotics",
    month = "July",
    year = "2003",
    keywords="3-D perception, geometric modeling, 3-D modeling, registration, mine mapping" 
    }

  6. Alonzo Kelly and Ranjith Unnikrishnan. Efficient Construction of Globally Consistent Ladar Maps using Pose Network Topology and Nonlinear Programming. In Proceedings of the 11th International Symposium on Robotics Research (ISRR '03), November 2003.
    @inproceedings{Kelly_2003_4587,
    author = "Alonzo Kelly and Ranjith Unnikrishnan",
    title = "Efficient Construction of Globally Consistent Ladar Maps using Pose Network Topology and Nonlinear Programming",
    booktitle = "Proceedings of the 11th International Symposium of Robotics Research (ISRR '03)",
    month = "November",
    year = "2003" 
    }

  7. Sanjiv Kumar and Martial Hebert. Discriminative Fields for Modeling Spatial Dependencies in Natural Images. In Advances in Neural Information Processing Systems (NIPS), December 2003. (pdf)
    @inproceedings{Kumar_2003_4597,
    author = "Sanjiv Kumar and Martial Hebert",
    title = "Discriminative Fields for Modeling Spatial Dependencies in Natural Images",
    booktitle = "in proc. advances in Neural Information Processing Systems (NIPS)",
    month = "December",
    year = "2003",
    pdf = "http://www.ri.cmu.edu/pub_files/pub4/kumar_sanjiv_2003_3/kumar_sanjiv_2003_3.pdf" 
    }

  8. Sanjiv Kumar and Martial Hebert. Man-Made Structure Detection in Natural Images using a Causal Multiscale Random Field. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), volume 1, pages 119-126, 2003. (pdf)
    @inproceedings{Kumar_2003_4595,
    author = "Sanjiv Kumar and Martial Hebert",
    title = "Man-Made Structure Detection in Natural Images using a Causal Multiscale Random Field",
    booktitle = "in proc. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)",
    year = "2003",
    volume = "1",
    pages = "119-126",
    pdf ="http://www.ri.cmu.edu/pub_files/pub4/kumar_sanjiv_2003_2/kumar_sanjiv_2003_2.pdf" 
    }

  9. Sanjiv Kumar and Martial Hebert. Discriminative Random Fields: A Discriminative Framework for Contextual Interaction in Classification. In Proceedings of the 2003 IEEE International Conference on Computer Vision (ICCV '03), volume 2, pages 1150-1157, 2003. (pdf)
    @inproceedings{Kumar_2003_4596,
    author = "Sanjiv Kumar and Martial Hebert",
    title = "Discriminative Random Fields: A Discriminative Framework for Contextual Interaction in Classification",
    booktitle = "Proceedings of the 2003 IEEE International Conference on Computer Vision (ICCV '03)",
    year = "2003",
    volume = "2",
    pages = "1150-1157",
    pdf ="http://www.ri.cmu.edu/pub_files/pub4/kumar_sanjiv_2003_4/kumar_sanjiv_2003_4.pdf" 
    }

  10. Shyjan Mahamud and Martial Hebert. The Optimal Distance Measure for Object Detection. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2003. (url) (pdf)
    Abstract: "We develop a multi-class object detection framework whose core component is a nearest neighbor search over object part classes. The performance of the overall system is critically dependent on the distance measure used in the nearest neighbor search. A distance measure that minimizes the mis-classification risk for the 1-nearest neighbor search can be shown to be the probability that a pair of input image measurements belong to different classes. In practice, we model the optimal distance measure using a linear logistic model that combines the discriminative powers of more elementary distance measures associated with a collection of simple to construct feature spaces like color, texture and local shape properties. Furthermore, in order to perform search over large training sets efficiently, the same framework was extended to find hamming distance measures associated with simple discriminators. By combining this discrete distance model with the continuous model, we obtain a hierarchical distance model that is both fast and accurate. Finally, the nearest neighbor search over object part classes was integrated into a whole object detection system and evaluated against an indoor detection task yielding good results."
    @inproceedings{Mahamud_2003_4708,
    author = "Shyjan Mahamud and Martial Hebert",
    title = "The Optimal Distance Measure for Object Detection",
    booktitle = "IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)",
    year = "2003",
    pdf="http://www.ri.cmu.edu/pub_files/pub4/mahamud_shyjan_2003_3/mahamud_shyjan_2003_3.pdf",
    url="http://www.ri.cmu.edu/pubs/pub_4708.html",
    abstract="We develop a multi-class object detection framework whose core component is a nearest neighbor search over object part classes. The performance of the overall system is critically dependent on the distance measure used in the nearest neighbor search. A distance measure that minimizes the mis-classification risk for the 1-nearest neighbor search can be shown to be the probability that a pair of input image measurements belong to different classes. In practice, we model the optimal distance measure using a linear logistic model that combines the discriminative powers of more elementary distance measures associated with a collection of simple to construct feature spaces like color, texture and local shape properties. Furthermore, in order to perform search over large training sets efficiently, the same framework was extended to find hamming distance measures associated with simple discriminators. By combining this discrete distance model with the continuous model, we obtain a hierarchical distance model that is both fast and accurate. Finally, the nearest neighbor search over object part classes was integrated into a whole object detection system and evaluated against an indoor detection task yielding good results." 
    }
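
    Illustrative sketch (not from the paper): the abstract models the optimal distance as the probability that two measurements belong to different classes, approximated by a linear logistic combination of elementary distances (color, texture, local shape), and uses it in a 1-nearest-neighbor search over object part classes. A minimal version of that inference step, with the logistic coefficients assumed to be already fit on labeled same-class/different-class pairs, might be:

        import numpy as np

        def learned_distance(elementary_dists, alpha, beta):
            """elementary_dists: (n_pairs, n_measures) elementary distances
            between pairs of measurements; alpha, beta: logistic coefficients
            (training not shown). Returns the modeled probability that each
            pair belongs to different classes, used directly as the distance."""
            z = elementary_dists @ alpha + beta
            return 1.0 / (1.0 + np.exp(-z))

        def nearest_part_class(dists_to_training, part_labels, alpha, beta):
            """dists_to_training: (n_train, n_measures) elementary distances
            from one query measurement to each training measurement."""
            d = learned_distance(dists_to_training, alpha, beta)
            return part_labels[np.argmin(d)]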

  11. Shyjan Mahamud and Martial Hebert. Minimum Risk Distance Measure for Object Recognition. In IEEE International Conference on Computer Vision (ICCV), 2003. (url) (pdf)
    Abstract: "Recently, the optimal distance measure for a given object discrimination task under the nearest neighbor framework was derived. For ease of implementation and efficiency considerations, the optimal distance measure was approximated by combining more elementary distance measures defined on simple feature spaces. In this paper, we address two important issues that arise in practice for such an approach: (a) What form should the elementary distance measure in each feature space take? We motivate the need to use optimal distance measures in simple feature spaces as the elementary distance measures; such distance measures have the desirable property that they are invariant to distance-respecting transformations. (b) How do we combine the elementary distance measures? We present the precise statistical assumptions under which a linear logistic model holds exactly. We benchmark our model with three other methods on a challenging face discrimination task and show that our approach is competitive with the state of the art."
    @inproceedings{Mahamud_2003_4706,
    author = "Shyjan Mahamud and Martial Hebert",
    title = "Minimum Risk Distance Measure for Object Recognition",
    booktitle = "IEEE International Conference on Computer Vision (ICCV)",
    year = "2003",
    pdf="http://www.ri.cmu.edu/pub_files/pub4/mahamud_shyjan_2003_1/mahamud_shyjan_2003_1.pdf",
    url="http://www.ri.cmu.edu/pubs/pub_4706.html",
    abstract="Recently, the optimal distance measure for a given object discrimination task under the nearest neighbor framework was derived. For ease of implementation and efficiency considerations, the optimal distance measure was approximated by combining more elementary distance measures defined on simple feature spaces. In this paper, we address two important issues that arise in practice for such an approach: (a) What form should the elementary distance measure in each feature space take? We motivate the need to use optimal distance measures in simple feature spaces as the elementary distance measures; such distance measures have the desirable property that they are invariant to distance-respecting transformations. (b) How do we combine the elementary distance measures? We present the precise statistical assumptions under which a linear logistic model holds exactly. We benchmark our model with three other methods on a challenging face discrimination task and show that our approach is competitive with the state of the art." 
    }

  12. Aaron Christopher Morris, Raghavendra Rao Donamukkala, Anuj Kapuria, Aaron M Steinfeld, J. Matthews, J. Dunbar-Jacobs, and Sebastian Thrun. Robotic Walker that Provides Guidance. In Proceedings of the 2003 IEEE Conference on Robotics and Automation (ICRA '03), May 2003.
    @inproceedings{Morris_2003_4476,
    author = "Aaron Christopher Morris and Raghavendra Rao Donamukkala and Anuj Kapuria and Aaron M Steinfeld and J. Matthews and J. Dunbar-Jacobs and Sebastian Thrun",
    title = "Robotic Walker that Provides Guidance",
    booktitle = "Proceedings of the 2003 IEEE Conference on Robotics and Automation (ICRA '03)",
    month = "May",
    year = "2003" 
    }

  13. Aaron Christopher Morris, Derek Kurth, Daniel Huber, Chuck Whittaker, and Scott Thayer. Case Studies of a Borehole Deployable Robot for Limestone Mine Profiling and Mapping. In International Conference on Field and Service Robotics (FSR), July 2003.
    Keywords: 3-D perception, geometric modeling, 3-D modeling, registration, mine mapping.
    @inproceedings{Morris_2003_4428,
    author = "Aaron Christopher Morris and Derek Kurth and Daniel Huber and Chuck Whittaker and Scott Thayer",
    title = "Case Studies of a Borehole Deployable Robot for Limestone Mine Profiling and Mapping",
    booktitle = "International Conference on Field and Service Robotics (FSR)",
    month = "July",
    year = "2003",
    keywords="3-D perception, geometric modeling, 3-D modeling, registration, mine mapping" 
    }

  14. Bart Nabbe and Martial Hebert. Where and When to Look. In IROS 2003, October 2003. IEEE. (pdf)
    Keywords: outdoor navigation, dynamic planning, mid-range sensing, wide baseline stereo.
    @inproceedings{Nabbe_2003_4581,
    author = "Bart Nabbe and Martial Hebert",
    title = "Where and When to Look",
    booktitle = "IROS 2003",
    month = "October",
    year = "2003",
    publisher = "IEEE",
    pdf="http://www.ri.cmu.edu/pub_files/pub4/nabbe_bart_2003_1/nabbe_bart_2003_1.pdf",
    keywords="outdoor navigation, dynamic planning, mid-range sensing, wide baseline stereo" 
    }

  15. Caroline Pantofaru, Ranjith Unnikrishnan, and Martial Hebert. Toward Generating Labeled Maps from Color and Range Data for Robot Navigation. In Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2003. (pdf)
    Keywords: 3-D data, object recognition, image segmentation, sensor fusion, scene understanding.
    Abstract: "This paper addresses the problem of extracting information from range and color data acquired by a mobile robot in urban environments. Our approach extracts geometric structures from clouds of 3-D points and regions from the corresponding color images, labels them based on prior models of the objects expected in the environment - buildings in the current experiments - and combines the two sources of information into a composite labeled map. Ultimately, our goal is to generate maps that are segmented into objects of interest, each of which is labeled by its type, e.g., buildings, vegetation, etc. Such a map provides a higher-level representation of the environment than the geometric maps normally used for mobile robot navigation. The techniques presented here are a step toward the automatic construction of such labeled maps."
    @inproceedings{pantofaru-iros-03,
    author = "Caroline Pantofaru and Ranjith Unnikrishnan and Martial Hebert",
    title = "Toward Generating Labeled Maps from Color and Range Data for Robot Navigation",
    booktitle = "Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)",
    month = "October",
    year = "2003",
    pdf = "http://www.ri.cmu.edu/pub_files/pub4/pantofaru_caroline_2003_1/pantofaru_caroline_2003_1.pdf",
    keywords="3-D data, object recognition, image segmentation, sensor fusion, scene understanding",
    abstract="This paper addresses the problem of extracting information from range and color data acquired by a mobile robot in urban environments. Our approach extracts geometric structures from clouds of 3-D points and regions from the corresponding color images, labels them based on prior models of the objects expected in the environment - buildings in the current experiments - and combines the two sources of information into a composite labeled map. Ultimately, our goal is to generate maps that are segmented into objects of interest, each of which is labeled by its type, e.g., buildings, vegetation, etc. Such a map provides a higher-level representation of the environment than the geometric maps normally used for mobile robot navigation. The techniques presented here are a step toward the automatic construction of such labeled maps." 
    }

  16. Henry Schneiderman. Learning Statistical Structure for Object Detection. In Computer Analysis of Images and Patterns (CAIP), August 2003. Springer-Verlag. (url)
    Abstract: "Many classes of images exhibit sparse structuring of statistical dependency. Each variable has strong statistical dependency with a small number of other variables and negligible dependency with the remaining ones. Such structuring makes it possible to construct a powerful classifier by only representing the stronger dependencies among the variables. In particular, a semi-naïve Bayes classifier compactly represents sparseness. A semi-naïve Bayes classifier decomposes the input variables into subsets and represents statistical dependency within each subset, while treating the subsets as statistically independent. However, learning the structure of a semi-naïve Bayes classifier is known to be NP complete. The high dimensionality of images makes statistical structure learning especially challenging. This paper describes an algorithm that searches for the structure of a semi-naïve Bayes classifier in this large space of possible structures. The algorithm seeks to optimize two cost functions: a localized error in the log-likelihood ratio function to restrict the structure and a global classification error to choose the final structure. We use this approach to train detectors for several objects including faces, eyes, ears, telephones, push-carts, and door-handles. These detectors perform robustly with a high detection rate and low false alarm rate in unconstrained settings over a wide range of variation in background scenery and lighting."
    @inproceedings{Schneiderman_2003_4413,
    author = "Henry Schneiderman",
    title = "Learning Statistical Structure for Object Detection",
    booktitle = "Computer Analysis of Images and Patterns (CAIP), 2003",
    month = "August",
    year = "2003",
    publisher = "Springer-Verlag",
    url ="http://www.ri.cmu.edu/pubs/pub_4413.html",
    abstract="Many classes of images exhibit sparse structuring of statistical dependency. Each variable has strong statistical dependency with a small number of other variables and negligible dependency with the remaining ones. Such structuring makes it possible to construct a powerful classifier by only representing the stronger dependencies among the variables. In particular, a semi-naïve Bayes classifier compactly represents sparseness. A semi-naïve Bayes classifier decomposes the input variables into subsets and represents statistical dependency within each subset, while treating the subsets as statistically inde-pendent. However, learning the structure of a semi-naïve Bayes classifier is known to be NP complete. The high dimensionality of images makes statistical structure learning especially challenging. This paper describes an algorithm that searches for the structure of a semi-naïve Bayes classifier in this large space of possible structures. The algorithm seeks to optimize two cost functions: a localized error in the log-likelihood ratio function to restrict the structure and a global classification error to choose the final structure. We use this approach to train detectors for several objects including faces, eyes, ears, telephones, push-carts, and door-handles. These detectors perform robustly with a high detection rate and low false alarm rate in unconstrained settings over a wide range of variation in background scenery and lighting.",
    url="http://www.ri.cmu.edu/pubs/pub_4413.html" 
    }
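
    Illustrative sketch (not from the paper): the abstract describes a semi-naïve Bayes classifier that groups input variables into subsets, models dependency jointly within each subset, treats subsets as independent, and detects objects by thresholding a log-likelihood ratio. Evaluating that ratio, with the structure search and probability tables (the paper's actual contribution) assumed given and smoothed, reduces to:

        import numpy as np

        def semi_naive_bayes_llr(x, subsets, object_tables, clutter_tables):
            """x: discretized feature vector for one image window (integer array).
            subsets: list of index tuples partitioning the variables.
            object_tables / clutter_tables: per-subset joint probability tables
            (numpy arrays, one axis per variable, assumed nonzero/smoothed).
            Returns log P(x|object) - log P(x|clutter); detection thresholds it."""
            llr = 0.0
            for s, p_obj, p_clut in zip(subsets, object_tables, clutter_tables):
                idx = tuple(x[list(s)])
                llr += np.log(p_obj[idx]) - np.log(p_clut[idx])
            return llr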

  17. Ranjith Unnikrishnan and Martial Hebert. Robust Extraction of Multiple Structures from Non-uniformly Sampled Data. In Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '03), volume 2, pages 1322-29, October 2003. (pdf)
    Keywords: mobile robot, 3-D data, object recognition, nonparametric statistics, robust estimation, scene understanding.
    Abstract: "The extraction of multiple coherent structures from point clouds is crucial to the problem of scene modeling. While many statistical methods exist for robust estimation from noisy data, they are inadequate for addressing issues of scale, semi-structured clutter, and large point density variation together with the computational restrictions of autonomous navigation. This paper extends an approach of nonparametric projection-pursuit based regression to compensate for the non-uniform and directional nature of data sampled in outdoor environments. The proposed algorithm is employed for extraction of planar structures and clutter grouping. Results are shown for scene abstraction of 3-D range data in large urban scenes."
    @inproceedings{Unnikrishnan_2003_4589,
    author = "Ranjith Unnikrishnan and Martial Hebert",
    title = "Robust Extraction of Multiple Structures from Non-uniformly Sampled Data",
    booktitle = "Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '03)",
    month = "October",
    year = "2003",
    volume = "2",
    pages = "1322-29",
    keywords = "mobile robot",
    pdf ="http://www.ri.cmu.edu/pub_files/pub4/unnikrishnan_ranjith_2003_1/unnikrishnan_ranjith_2003_1.pdf",
    keywords = {3-D data, object recognition, nonparametric statistics, robust estimation, scene understanding},
    abstract="The extraction of multiple coherent structures from point clouds is crucial to the problem of scene modeling. While many statistical methods exist for robust estimation from noisy data, they are inadequate for addressing issues of scale, semi-structured clutter, and large point density variation together with the computational restrictions of autonomous navigation. This paper extends an approach of nonparametric projection-pursuit based regression to compensate for the non-uniform and directional nature of data sampled in outdoor environments. The proposed algorithm is employed for extraction of planar structures and clutter grouping. Results are shown for scene abstraction of 3-D range data in large urban scenes." 
    }

  18. Nicolas Vandapel, Raghavendra Rao Donamukkala, and Martial Hebert. Experimental Results in Using Aerial LADAR Data for Mobile Robot Navigation. In International Conference on Field and Service Robotics, 2003. (pdf)
    @inproceedings{vandapel-fsr-03,
    author = "Nicolas Vandapel and Raghavendra Rao Donamukkala and Martial Hebert",
    title = "Experimental Results in Using Aerial LADAR Data for Mobile Robot Navigation",
    booktitle = "International Conference on Field and Service Robotics",
    year = "2003",
    pdf ="http://www.ri.cmu.edu/pub_files/pub4/vandapel_nicolas_2003_1/vandapel_nicolas_2003_1.pdf" 
    }

  19. Nicolas Vandapel, Raghavendra Rao Donamukkala, and Martial Hebert. Quality Assessment of Traversability Maps from Aerial LIDAR Data for an Unmanned Ground Vehicle. In International Conference on Intelligent Robots and Systems (IROS), October 2003. (pdf)
    @inproceedings{vandapel-iros-03,
    author = "Nicolas Vandapel and Raghavendra Rao Donamukkala and Martial Hebert",
    title = "Quality Assessment of Traversability Maps from Aerial LIDAR Data for an Unmanned Ground Vehicle",
    booktitle = "International Conference on Intelligent Robots and Systems (IROS)",
    month = "October",
    year = "2003",
    pdf ="http://www.ri.cmu.edu/pub_files/pub4/vandapel_nicolas_2003_2/vandapel_nicolas_2003_2.pdf" 
    }

  20. Jerome Vignola, Jean-Francois Lalonde, and Robert Bergevin. Progressive Human Skeleton Fitting. In Proceedings of the 16th Conference on Vision Interface, 2003. (url)
    Keywords: pose recognition.
    Annotation: "This paper proposes a method to fit a skeleton or stick-model to a blob to determine the pose of a person in an image. The input is a binary image representing the silhouette of a person and the output is a stick-model coherent with the pose of the person in this image. A torso model is first defined, and is then scaled and fitted to the blob using the distance transform of the original image. Then, the fitting is performed independently for each of the four limbs (two arms, two legs), again using the distance transform. The fact that each limb is fitted independently speeds up the fitting process, avoiding the combinatorial complexity problems that are frequent with this type of method."
    @InProceedings{vignola-vi-03,
    author = {Jerome Vignola and Jean-Francois Lalonde and Robert Bergevin},
    title = {Progressive Human Skeleton Fitting},
    booktitle = {Proceedings of the 16th Conference on Vision Interface},
    year = 2003,
    annote = {This paper proposes a method to fit a skeleton or stick-model to a blob to determine the pose of a person in an image. The input is a binary image representing the silhouette of a person and the output is a stick-model coherent with the pose of the person in this image. A torso model is first defined, and is then scaled and fitted to the blob using the distance transform of the original image. Then, the fitting is performed independently for each of the four limbs (two arms, two legs), again using the distance transform. The fact that each limb is fitted independently speeds up the fitting process, avoiding the combinatorial complexity problems that are frequent with this type of method.},
    url = "http://vision.gel.ulaval.ca/fr/publications/Id_444/PublDetails.php",
    keywords = {pose recognition} 
    }
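
    Illustrative sketch (not from the paper): the annotation describes fitting a torso model and then each limb independently to a binary silhouette using its distance transform. One way to score how well a candidate stick segment lies inside the blob, assuming the silhouette is given as a binary array, is to average the distance-transform values sampled along the segment; the paper's actual torso-then-limbs search is not shown.

        import numpy as np
        from scipy.ndimage import distance_transform_edt

        def limb_fit_score(silhouette, p0, p1, n_samples=50):
            """silhouette: binary (H, W) array, 1 inside the person blob.
            p0, p1: (row, col) endpoints of a candidate limb segment.
            Returns the mean distance-transform value along the segment;
            higher means the segment lies deeper inside the blob. In practice
            the transform would be computed once and reused across candidates."""
            dt = distance_transform_edt(silhouette)
            t = np.linspace(0.0, 1.0, n_samples)
            rows = np.round(p0[0] + t * (p1[0] - p0[0])).astype(int)
            cols = np.round(p0[1] + t * (p1[1] - p0[1])).astype(int)
            rows = np.clip(rows, 0, silhouette.shape[0] - 1)
            cols = np.clip(cols, 0, silhouette.shape[1] - 1)
            return dt[rows, cols].mean()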



The VMR Lab is part of the Vision and Autonomous Systems Center within the Robotics Institute in the School of Computer Science, Carnegie Mellon University.
This page was generated by a modified version of bibtex2html written by Gregoire Malandain