William W. Cohen's Papers: Information Extraction
- Andrew Arnold and William W. Cohen (2008): Intra-document Structural Frequency Features for Semi-Supervised Domain Adaptation in CIKM-2008.
- Andrew Arnold, Ramesh Nallapati and William W. Cohen (2008): Exploiting Feature Hierarchy for Transfer Learning in Named Entity Recognition in ACL-2008.
- Andrew Arnold, Ramesh Nallapati and William W. Cohen (2007): A Comparative Study of Methods for Transductive Transfer Learning in ICDM Workshop on Mining and Management of Biological Data.
- Richard Wang and William Cohen (2007): Language-Independent Set Expansion of Named Entities using the Web in ICDM-2007.
- Zhenzhen Kou and William W. Cohen (2007): Stacked Graphical Models for Efficient Inference in Markov Random Fields in SDM-2007.
- Zhenzhen Kou, William W. Cohen, and Robert F. Murphy (2007): A Stacked Graphical Model for Associating Information from Text And Images In Figures in PSB-2007.
- Richard C. Wang, Anthony Tomasic, Robert E. Frederking, William W. Cohen (2006): Learning to Extract Gene-Protein Names from Weakly-Labeled Text in CMU SCS Technical Report Series (CMU-LTI-08-04).
- Einat Minkov, Richard C. Wang, Anthony Tomasic and William W. Cohen (2006): NER Systems that Suit User's Preferences: Adjusting the Recall-Precision Trade-off for Entity Extraction in HLT/NAACL-2006 (short paper).
- William W. Cohen (2006): A Graph-Search Framework for GeneId Ranking (Extended Abstract) in BioNLP'06.
- William W. Cohen & Einat Minkov (2006): A Graph-Search Framework for Associating Gene Identifiers with Documents in BMC Bioinformatics.
- Einat Minkov, Richard C. Wang, and William W. Cohen (2005): Extracting Personal Names from Email: Applying Named Entity Recognition to Informal Text in EMNLP/HLT-2005.
- William W. Cohen, Einat Minkov & Anthony Tomasic (2005): Learning to Understand Web Site Update Requests in IJCAI-2005.
- Zhenzhen Kou, William W. Cohen & Robert F. Murphy (2005): High-Recall Protein Entity Recognition Using a Dictionary in ISMB-2005.
- Einat Minkov, Richard Wang & William Cohen (2004): Extracting Personal Names from Emails: Applying Named Entity Recognition to Informal Text in preparation.
- Sunita Sarawagi & William W. Cohen (2004): Semi-Markov Conditional Random Fields for Information Extraction in NIPS 2004.
- Robert F. Murphy, Zhenzhen Kou, Juchang Hua, Matthew Joffe, William W. Cohen (2004): Extracting and Structuring Subcellular Location Information from On-line Journal Articles: The Subcellular Location Image Finder in KSCE-2004.
- Anthony Tomasic, William W. Cohen, Einat Minkov, ... (2004): Learning to Navigate Web Forms in IIWeb 2004.
- Vitor R. Carvalho & William W. Cohen (2004): Learning to Extract Signature and Reply Lines from Email in CEAS 2004.
- William W. Cohen & Sunita Sarawagi (2004): Exploiting Dictionaries in Named Entity Extraction: Combining Semi-Markov Extraction Processes and Data Integration Methods in KDD 2004: 89-98.
- William W. Cohen (2003): Learning and Discovering Structure in Web Pages in IEEE Data Eng. Bull. 26(3): 3-10 (2003).
- William W. Cohen, Zhenzhen Kou & Robert F. Murphy (2003): Extracting Information from Text and Images for Location Proteomics in BIOKDD 2003: 2-9.
- William W. Cohen, Richard Wang & Robert Murphy (2003): Understanding Captions in Biomedical Publications in KDD 2003: 499-504.
- William W. Cohen (2003): Infrastructure Components for Large-Scale Information Extraction Systems in IAAI 2003: 71-78.
- William W. Cohen (2002): Improving A Page Classifier with Anchor Extraction and Link Analysis in NIPS 2002.
- William W. Cohen, Matthew Hurst & Lee S. Jensen (2003): A Flexible Learning System for Wrapping Tables and Lists in HTML Documents in Web Document Analysis: Challenges and Opportunities, ed. Antonacopoulos & Hu, World Scientific Publishing. (Originally published as: William W. Cohen, Matthew Hurst & Lee S. Jensen (2002): A Flexible Learning System for Wrapping Tables and Lists in HTML Documents in WWW 2002: 232-241; Lee S. Jensen & William W. Cohen (2001): A Structured Wrapper Induction System for Extracting Information from Semi-Structured Documents in Proc. of the IJCAI-2001 Workshop on Adaptive Text Extraction and Mining).
- William W. Cohen (2001): Issues in Extracting Information from the Web (Extended Abstract) in IWPT 2001.
- William W. Cohen (2000): Extracting Information from the Web for Concept Learning and Collaborative Filtering in ALT 2000: 1-12.
- William W. Cohen, Andrew McCallum, Dallan Quass (2000): Learning to Understand the Web in IEEE Data Eng. Bull. 23(3): 17-24 (2000).
- William W. Cohen and Wei Fan (1999): Learning Page-Independent Heuristics for Extracting Data from Web Pages in Computer Networks 31(11-16): 1641-1652 (1999). (Originally published as: William W. Cohen and Wei Fan (1999): Learning Page-Independent Heuristics for Extracting Data from Web Pages in WWW 1999).
- William W. Cohen (1999): Reasoning about Textual Similarity in a Web-Based Information Access System in Autonomous Agents and Multi-Agent Systems 2(1): 65-86 (1999).
- William W. Cohen (1999): A Demonstration of WHIRL (demonstration abstract) in SIGIR 1999: 327.