William W. Cohen's Papers: Information Extraction

[RSS feed]

  1. Richard Wang and William W. Cohen (2009): Character-level Analysis of Semi-Structured Documents for Set Expansion in EMNLP 2009.
  2. Richard Wang and William W. Cohen (2009): Automatic Set Instance Extraction using the Web in ACL-IJNLP 2009.
  3. Richard Wang and William W. Cohen (2008): Iterative Set Expansion of Named Entities Using the Web in ICDM-2008.
  4. Andrew Arnold and William W. Cohen (2008): Intra-document Structural Frequency Features for Semi-Supervised Domain Adaptation in CIKM-2008.
  5. Andrew Arnold, Ramesh Nallapati and William W. Cohen (2008): Exploiting Feature Hierarchy for Transfer Learning in Named Entity Recognition in ACL-2008.
  6. Andrew Arnold, Ramesh Nallapati and William W. Cohen (2007): A Comparative Study of Methods for Transductive Transfer Learning in ICDM Workshop on Mining and Management of Biological Data.
  7. Richard Wang and William Cohen (2007): Language-Independent Set Expansion of Named Entities using the Web in ICDM-2007.
  8. Zhenzhen Kou and William W. Cohen (2007): Stacked Graphical Models for Efficient Inference in Markov Random Fields in SDM-2007.
  9. Zhenzhen Kou, William W. Cohen, and Robert F. Murphy (2007): A Stacked Graphical Model for Associating Information from Text And Images In Figures in PSB-2007.
  10. Richard C. Wang, Anthony Tomasic, Robert E. Frederking, William W. Cohen (2006): Learning to Extract Gene-Protein Names from Weakly-Labeled Text in CMU SCS Technical Report Series (CMU-LTI-08-04).
  11. Einat Minkov, Richard C.Wang, Anthony Tomasic and William W. Cohen (2006): NER Systems that Suit Users Preferences: Adjusting the Recall-Precision Trade-off for Entity Extraction in HLT/NAACL-2006 (short paper).
  12. William W. Cohen (2006): A Graph-Search Framework for GeneId Ranking (Extended Abstract) in BioNLP'06.
  13. William W. Cohen & Einat Minkov (2006): A Graph-Search Framework for Associating Gene Identifiers with Documents in BMC Bioinformatics.
  14. Einat Minkov, Richard C. Wang, and William W. Cohen (2005): Extracting Personal Names from Email: Applying Named Entity Recognition to Informal Text in EMNLP/HLT-2005.
  15. William W. Cohen, Einat Minkov & Anthony Tomasic (2005): Learning to Understand Web Site Update Requests in IJCAI-2005.
  16. Zhenzhen Kou, William W. Cohen & Robert F. Murphy (2005): High-Recall Protein Entity Recognition Using a Dictionary in ISMB-2005.
  17. Einat Minkov, Richard Wang & William Cohen (2004): Extracting Personal Names from Emails: Applying Named Entity Recognition to Informal Text in preparation.
  18. Sunita Sarawagi & William W. Cohen (2004): Semi-Markov Conditional Random Fields for Information Extraction in NIPS 2004.
  19. Robert F. Murphy, Zhenzhen Kou, Juchang Hua, Matthew Joffe, William W. Cohen (2004): Extracting and Structuring Subcellular Location Information from On-line Journal Articles: The Subcellular Location Image Finder in KSCE-2004.
  20. Anthony Tomasic, William W. Cohen, Einat Minkov, ... (2004): Learning to Navigate Web Forms in IIWeb 2004.
  21. Vitor R. Carvalho & William W. Cohen (2004): Learning to Extract Signature and Reply Lines from Email in CEAS 2004.
  22. William W. Cohen & Sunita Sarawagi (2004): Exploiting Dictionaries in Named Entity Extraction: Combining Semi-Markov Extraction Processes and Data Integration Methods in KDD 2004: 89-98.
  23. William W. Cohen (2003): Learning and Discovering Structure in Web Pages in IEEE Data Eng. Bull. 26(3): 3-10 (2003).
  24. William W. Cohen, Zhenzhen Kou & Robert F. Murphy (2003): Extracting Information from Text and Images for Location Proteomics in BIOKDD 2003: 2-9.
  25. William W. Cohen, Richard Wang & Robert Murphy (2003): Understanding Captions in Biomedical Publications in KDD 2003: 499-504.
  26. William W. Cohen (2003): Infrastructure Components for Large-Scale Information Extraction Systems in IAAI 2003: 71-78.
  27. William W. Cohen (2002): Improving A Page Classifier with Anchor Extraction and Link Analysis in NIPS 2002.
  28. William W. Cohen, Matthew Hurst & Lee S. Jensen (2003): A Flexible Learning System for Wrapping Tables and Lists in HTML Documents in Web Document Analysis: Challenges and Opportunities, ed. Antonacopoulos & Hu, Word Scientific Publishing. (Originally published as: William W. Cohen, Matthew Hurst & Lee S. Jensen (2002): A Flexible Learning System for Wrapping Tables and Lists in HTML Documents in WWW 2002: 232-241; Lee S. Jensen & William W. Cohen (2001): A Structured Wrapper Induction System for Extracting Information from Semi-Structured Documents in Proc. of the IJCAI-2001 Workshop on Adaptive Text Extraction and Mining).
  29. William W. Cohen (2001): Issues in Extracting Information from the Web (Extended Abstract) in IWPT 2001.
  30. William W. Cohen (2000): Extracting Information from the Web for Concept Learning and Collaborative Filtering in ALT 2000: 1-12.
  31. William W. Cohen, Andrew McCallum, Dallan Quass (2000): Learning to Understand the Web in IEEE Data Eng. Bull. 23(3): 17-24 (2000).
  32. William W. Cohen and Wei Fan (1999): Learning Page-Independent Heuristics for Extracting Data from Web Pages in Computer Networks 31(11-16): 1641-1652 (1999). (Originally published as: William W. Cohen and Wei Fan (1999): Learning Page-Independent Heuristics for Extracting Data from Web Pages in WWW 1999).
  33. William W. Cohen (1999): Reasoning about Textual Similarity in a Web-Based Information Access in Autonomous Agents and Multi-Agent Systems 2(1): 65-86 (1999).
  34. William W. Cohen (1999): A Demonstration of WHIRL (demonstration abstract) in SIGIR 1999: 327.

[Selected papers| By topic: Matching/Data Integration| Text Categorization| Rule Learning| Explanation-Based Learning| Formal Results| Inductive Logic Programming| Information Extraction| Collaborative Filtering| Applications| By year: All papers| RSS]