
ChineseCharParser Class Reference

#include <ChineseCharParser.hpp>

Inheritance diagram for ChineseCharParser (image not shown): Parser, TextHandler.

Public Methods

 ChineseCharParser ()
void parseFile (const string &filename)
 Parse a file.

void parseBuffer (char *buf, int len)
 Parse a buffer of length len.

long fileTell () const

Static Public Attributes

const string identifier = "chinesechar"

Detailed Description

Parses unsegmented Chinese documents in NIST's TREC format (GB encoding), producing one token per character. The following fields are parsed: TEXT, HL, HEAD, HEADLINE, LP, TTL.
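The character-at-a-time tokenization described above can be sketched as follows. This is a minimal illustration, not the Lemur implementation: splitGbChars is a hypothetical helper, and it assumes GB2312 in its EUC-CN form, where any byte with the high bit set begins a two-byte character. Emitting each ASCII byte as its own token is also an assumption made for simplicity.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Lemur): split a GB2312 (EUC-CN) byte
// string into one token per character.
// Assumption: a byte with the high bit set starts a two-byte character;
// any other byte is a single-byte (ASCII) character.
std::vector<std::string> splitGbChars(const std::string &text) {
  std::vector<std::string> tokens;
  std::size_t i = 0;
  while (i < text.size()) {
    const unsigned char lead = static_cast<unsigned char>(text[i]);
    // Take two bytes for a GB character, or one byte if the input
    // ends in the middle of a character.
    const std::size_t width =
        (lead >= 0x80 && i + 1 < text.size()) ? 2 : 1;
    tokens.push_back(text.substr(i, width));
    i += width;
  }
  return tokens;
}
```

Each returned token holds the raw bytes of one character, so a caller can pass them on without decoding the GB text first.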


Constructor & Destructor Documentation

ChineseCharParser::ChineseCharParser  
 


Member Function Documentation

long ChineseCharParser::fileTell () const [virtual]
 

Returns the current byte offset into the file being parsed. Do not use with parseBuffer.

Implements Parser.

void ChineseCharParser::parseBuffer (char *buf, int len) [virtual]
 

Parse a buffer of length len.

Implements Parser.
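As a rough illustration of what scanning a buffer for one of the parsed fields involves, the sketch below collects the contents of every tagged span in a raw buffer. It is not the Lemur implementation: extractField is a hypothetical helper, and it assumes the fields are delimited by SGML-style tags such as <TEXT> and </TEXT>, as is usual for TREC documents. A plain byte search is safe here under the EUC-CN assumption, because both bytes of a GB2312 character have the high bit set and so cannot be mistaken for '<'.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Lemur): collect the contents of every
// <field>...</field> span found in a raw buffer of len bytes.
std::vector<std::string> extractField(const char *buf, int len,
                                      const std::string &field) {
  std::vector<std::string> spans;
  if (buf == nullptr || len <= 0) return spans;  // nothing to scan

  const std::string data(buf, static_cast<std::size_t>(len));
  const std::string open = "<" + field + ">";
  const std::string close = "</" + field + ">";

  std::size_t pos = 0;
  while ((pos = data.find(open, pos)) != std::string::npos) {
    const std::size_t start = pos + open.size();
    const std::size_t end = data.find(close, start);
    if (end == std::string::npos) break;  // unterminated field: stop
    spans.push_back(data.substr(start, end - start));
    pos = end + close.size();  // continue after the closing tag
  }
  return spans;
}
```

The (buf, len) pair mirrors the parseBuffer signature, so the buffer does not need to be NUL-terminated.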

void ChineseCharParser::parseFile (const string &filename) [virtual]
 

Parse a file.

Implements Parser.


Member Data Documentation

const string ChineseCharParser::identifier = "chinesechar" [static]
 

Reimplemented from Parser.


The documentation for this class was generated from the following files:
Generated on Wed Nov 3 12:59:26 2004 for Lemur Toolkit by doxygen 1.2.18