Machine Translation at Carnegie Mellon



Internal site: Wiki (can only be viewed and edited from hosts in the domains cmu.edu, uka.de, uni-karlsruhe.de)


Next MT lunch
      
Nguyen Bach
Apr 22, 2008, noon to 1:30pm, in Newell-Simon Hall 3305
Title: "Simulating Sentence Pairs Sampling Process via Source and Target Language Models"
Abstract: In a traditional word alignment process, every sentence pair is assigned the same occurrence count, which is normalized during training to produce the empirical probability. However, some sentence pairs may be more valuable, reliable, and appropriate than others, and these should carry a higher weight in training. To address this, we explored methods of resampling sentence pairs. We investigated three sets of features: sentence pair confidence (/sc/), genre-dependent sentence pair confidence (/gdsc/), and sentence-dependent phrase alignment confidence (/sdpc/) scores. These features were calculated over the entire training corpus and could easily be integrated into the phrase-based machine translation system.
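
As a rough illustration of the normalization step the abstract mentions, the sketch below contrasts uniform occurrence counts with per-pair weights. It is not the speaker's method: the function name `empirical_distribution`, the toy corpus, and the weight values are all hypothetical, and the /sc/, /gdsc/ and /sdpc/ scores themselves are not defined on this page, so arbitrary numbers stand in for them.

```python
from collections import Counter


def empirical_distribution(pairs, weights=None):
    """Normalize per-pair occurrence weights into probabilities.

    pairs   -- list of (source, target) sentence pairs
    weights -- optional list of non-negative weights, one per pair;
               defaults to 1.0 each (the uniform-count baseline)
    """
    if weights is None:
        weights = [1.0] * len(pairs)
    if len(weights) != len(pairs):
        raise ValueError("need exactly one weight per sentence pair")

    # Accumulate the weight mass of each distinct sentence pair.
    totals = Counter()
    for pair, weight in zip(pairs, weights):
        totals[pair] += weight

    mass = sum(totals.values())
    if mass <= 0:
        raise ValueError("total weight must be positive")
    return {pair: total / mass for pair, total in totals.items()}


# Toy corpus (hypothetical data).
corpus = [
    ("das haus", "the house"),
    ("ein buch", "a book"),
    ("das haus", "the house"),
]

# Baseline: every occurrence counts once.
uniform = empirical_distribution(corpus)

# Hypothetical confidence scores used as weights (assumed values).
weighted = empirical_distribution(corpus, weights=[1.0, 3.0, 1.0])
```

With uniform counts the repeated pair receives two thirds of the probability mass; with the assumed weights the single high-confidence pair dominates instead, which is the effect that reweighting or resampling is meant to achieve.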


Past MT lunches

  • Mar 19, 2008 Presenter: Matthias Eck, "Communicating Unknown Words in Machine Translation"
  • Nov 13, 2007 Presenter: Alok Parlikar, "(S (NP (NP Trees) (SBAR (WHNP that) (S (VP can)))) (VP help))"
  • Oct 09, 2007 Presenter: Aaron Phillips, "Sub-Phrasal Matching and Structural Templates in Example-Based MT"
  • Aug 14, 2007 Presenter: Sanjika Hewavitharana, "Experiments with a Noun-Phrase driven Statistical Machine Translation System"
  • May 22, 2007 Presenter: Abhaya Agarwal, "In Search of Better MT Evaluation Metric: Some Experiments"
  • Apr 17, 2007 Presenter: Lori Levin and Alison Alvarez, "An Assessment of Language Elicitation without the Supervision of a Linguist"
  • Feb 20, 2007 Presenter: Matthias Eck, "Translation Model Pruning"
  • Jan 16, 2007 Presenter: Ying (Joy) Zhang, "SALM: Suffix Array and its Applications in Empirical Language Processing"
  • Oct 17, 2006 Presenter: Ian Lane, "Tighter Coupling of ASR+MT: Initial Experiments & Future Directions"
  • May 23, 2006 Presenter: Jae Dong Kim, "Anchor-Based Symmetric Probabilistic Alignment"
  • April 18, 2006 Presenter: Ari Font-Llitjos, "Can the Internet help improve Machine Translation?"
  • February 21, 2006 Presenter: Alison Alvarez, "The MILE (Minor Language Elicitation) Corpus for Less Commonly Taught Languages"
  • December 08, 2005. Presenter: Ralf Brown, "Generalization and Context-Sensitivity for Example-Based Machine Translation".
  • November 15, 2005. Presenter: Chiori Hori, "Speech Translation with Multiple Speech and Translation Hypotheses".
  • October 18, 2005. Presenters: Rebecca Hwa and Carol Nichols from the NLP Lab at Pitt, "Adapting Resources for Parsing Arabic Dialects".
  • September 13, 2005. Presenter: Fei Huang, "Named Entity Extraction and Translation from Multimedia Documents".
  • August 23, 2005. Presenter: Joy (Ying Zhang), "David Chiang's ACL-2005 paper: A Hierarchical Phrase-Based Model for Statistical Machine Translation".
  • June 21, 2005. Presenter: Alison Alvarez, "Semi-Automated Elicitation Corpus Generation".
  • May 10, 2005. Presenter: Benjamin Han, "Using Collocations to Assess MT Quality".
  • April 19, 2005. Presenters: Joy (Ying Zhang) and Fei Huang, "Mining Translations for Key Phrases from Web Corpora".
  • March 22, 2005. Presenter: Shyamsundar Jayaraman, "A New Approach for Multi-Engine MT" (MEMT)
  • February 22, 2005. Presenter: Alon Lavie, "The METEOR metric" (MT Evaluation)
  • January 18, 2005.
  • December 14, 2004. Presenters: Violetta Cavalli-Sforza and Stephan Vogel

  • Comments? Something not working? Links to add? Contact Jaedy Kim