
TextHandler Class Reference

#include <TextHandler.hpp>

Inheritance diagram for TextHandler:

BrillPOSTokenizer, CtfIndexer, DocFreqIndexer, DocOffsetParser, FlattextDocMgr, FreqCounter, IndriTextHandler, InvFPTextHandler, KeyfileDocMgr, KeyfileTextHandler, MemParser, Parser, PropIndexTH, QueryDocument, QueryTextHandler, Stemmer, Stopper, StringQuery, WriterInQueryHandler, WriterTextHandler

Public Types

enum  TokenType {
  BEGINDOC = 1, ENDDOC = 2, WORDTOK = 3, BEGINTAG = 4,
  ENDTAG = 5, SYMBOLTOK = 6
}

Public Methods

 TextHandler ()
virtual ~TextHandler ()
virtual void setTextHandler (TextHandler *th)
 Set the TextHandler that this TextHandler will pass information on to.

virtual TextHandler * getTextHandler ()
 Get the TextHandler that this TextHandler will pass information on to.

virtual void foundToken (TokenType type, char *token=NULL, const char *orig=NULL, PropertyList *properties=NULL)
virtual char * handleBeginDoc (char *docno, const char *original, PropertyList *list)
virtual char * handleEndDoc (char *token, const char *original, PropertyList *list)
virtual char * handleWord (char *word, const char *original, PropertyList *list)
virtual char * handleBeginTag (char *tag, const char *original, PropertyList *list)
 Handle a begin tag.

virtual char * handleEndTag (char *tag, const char *original, PropertyList *list)
 Handle an end tag.

virtual char * handleSymbol (char *symbol, const char *original, PropertyList *list)
virtual void foundDoc (char *docno)
 Found a document with document number.

virtual void foundDoc (char *docno, const char *original)
virtual void foundWord (char *word)
 Found a word.

virtual void foundWord (char *word, const char *original)
virtual void foundEndDoc ()
 Found end of doc.

virtual void foundSymbol (char *sym)
 Found a symbol.

virtual char * handleDoc (char *docno)
 Handle a doc.

virtual char * handleWord (char *word)
 Handle a word, possibly transforming it.

virtual void handleEndDoc ()
 Handle the end of the doc.

virtual char * handleSymbol (char *sym)
 Handle a symbol, possibly transforming it.

virtual string getCategory ()
 Return the category of this TextHandler.

virtual string getIdentifier ()
 Return a unique identifier for this TextHandler object.


Static Public Attributes

const string category = "TextHandler"
const string identifier = "TextHandler"

Protected Attributes

TextHandler * textHandler
 The next textHandler in the chain.

string cat
string iden
char buffer [MAXWORDSIZE]

Detailed Description

TextHandlers have their own internal buffer for modification of the string. The foundWord function copies the word into the buffer then calls handleWord with the copy. The handleWord function may then modify the string and return the pointer to the string. This process is also done for foundDoc/handleDoc.
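
The copy-then-handle flow described above can be sketched roughly as follows. This is not the Lemur source: `SketchHandler`, `UppercaseHandler`, `runWord`, and the value chosen for `kMaxWordSize` are hypothetical stand-ins, used only to show why `handleWord` can modify the string it receives without touching the caller's data. Unlike the documented `foundWord`, which returns `void`, the sketch returns the handled pointer so the result is easy to inspect.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for MAXWORDSIZE; the real value is not given on this page.
const std::size_t kMaxWordSize = 1024;

// Minimal stand-in for a handler that owns an internal buffer.
class SketchHandler {
public:
  virtual ~SketchHandler() {}

  // Copy the word into the internal buffer, then pass the copy to handleWord.
  // The caller's string is never modified.
  char *foundWord(const char *word) {
    assert(word != NULL);
    std::strncpy(buffer, word, kMaxWordSize - 1);
    buffer[kMaxWordSize - 1] = '\0';  // strncpy does not terminate on truncation
    return handleWord(buffer);
  }

  // Default behaviour: return the word unchanged.
  virtual char *handleWord(char *word) { return word; }

protected:
  char buffer[kMaxWordSize];
};

// Example subclass that transforms the buffered copy in place.
class UppercaseHandler : public SketchHandler {
public:
  virtual char *handleWord(char *word) {
    for (char *p = word; *p != '\0'; ++p) {
      *p = static_cast<char>(std::toupper(static_cast<unsigned char>(*p)));
    }
    return word;
  }
};

// Helper: run one word through a handler and return the result as a std::string.
std::string runWord(SketchHandler &handler, const char *word) {
  char *result = handler.foundWord(word);
  return result != NULL ? std::string(result) : std::string();
}
```

Because the transformation happens on the handler's own copy, the same input pointer can be passed to several handlers without one handler's changes leaking into another's input.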


Member Enumeration Documentation

enum TextHandler::TokenType
 

Enumeration values:
BEGINDOC 
ENDDOC 
WORDTOK 
BEGINTAG 
ENDTAG 
SYMBOLTOK 


Constructor & Destructor Documentation

TextHandler::TextHandler   [inline]
 

virtual TextHandler::~TextHandler   [inline, virtual]
 


Member Function Documentation

virtual void TextHandler::foundDoc char *    docno,
const char *    original
[inline, virtual]
 

virtual void TextHandler::foundDoc char *    docno [inline, virtual]
 

Found a document with document number.

virtual void TextHandler::foundEndDoc   [inline, virtual]
 

Found end of doc.

virtual void TextHandler::foundSymbol char *    sym [inline, virtual]
 

Found a symbol.

virtual void TextHandler::foundToken TokenType    type,
char *    token = NULL,
const char *    orig = NULL,
PropertyList *    properties = NULL
[inline, virtual]
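
How `foundToken` relates to the `handle*` methods is not spelled out on this page. One plausible arrangement, sketched below with a hypothetical `DispatchSketch` class, is a switch on the token type that routes each token to the matching handler. Only the enumeration values are taken from this page; the method bodies and the `events` log are assumptions made for illustration, not the toolkit's implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in class; only the enum values come from this page.
class DispatchSketch {
public:
  enum TokenType {
    BEGINDOC = 1, ENDDOC = 2, WORDTOK = 3, BEGINTAG = 4,
    ENDTAG = 5, SYMBOLTOK = 6
  };

  virtual ~DispatchSketch() {}

  // Route a token to the handler that matches its type.
  void foundToken(TokenType type, const char *token = NULL) {
    const std::string text = (token != NULL) ? std::string(token) : std::string();
    switch (type) {
      case BEGINDOC:  handleBeginDoc(text); break;
      case ENDDOC:    handleEndDoc(text);   break;
      case WORDTOK:   handleWord(text);     break;
      case BEGINTAG:  handleBeginTag(text); break;
      case ENDTAG:    handleEndTag(text);   break;
      case SYMBOLTOK: handleSymbol(text);   break;
    }
  }

  // Each handler simply records what it saw, so the routing can be inspected.
  virtual void handleBeginDoc(const std::string &t) { events.push_back("begindoc " + t); }
  virtual void handleEndDoc(const std::string &t)   { events.push_back("enddoc " + t); }
  virtual void handleWord(const std::string &t)     { events.push_back("word " + t); }
  virtual void handleBeginTag(const std::string &t) { events.push_back("begintag " + t); }
  virtual void handleEndTag(const std::string &t)   { events.push_back("endtag " + t); }
  virtual void handleSymbol(const std::string &t)   { events.push_back("symbol " + t); }

  std::vector<std::string> events;  // log of handled tokens, in order
};
```

A single entry point of this kind lets a tokenizer emit every token through one call, while subclasses override only the handlers they care about.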
 

virtual void TextHandler::foundWord char *    word,
const char *    original
[inline, virtual]
 

virtual void TextHandler::foundWord char *    word [inline, virtual]
 

Found a word.

virtual string TextHandler::getCategory   [inline, virtual]
 

Return the category of this TextHandler.

virtual string TextHandler::getIdentifier   [inline, virtual]
 

Return a unique identifier for this TextHandler object.

virtual TextHandler* TextHandler::getTextHandler   [inline, virtual]
 

Get the TextHandler that this TextHandler will pass information on to.

virtual char* TextHandler::handleBeginDoc char *    docno,
const char *    original,
PropertyList *    list
[inline, virtual]
 

Handle a doc begin. The default implementation calls handleDoc for backward compatibility.

virtual char* TextHandler::handleBeginTag char *    tag,
const char *    original,
PropertyList *    list
[inline, virtual]
 

Handle a begin tag.

Reimplemented in IndriTextHandler, and ElemDocMgr.

virtual char* TextHandler::handleDoc char *    docno [inline, virtual]
 

Handle a doc.

Reimplemented in DocFreqIndexer, FreqCounter, IndriTextHandler, InvFPTextHandler, KeyfileTextHandler, PropIndexTH, FlattextDocMgr, KeyfileDocMgr, WriterInQueryHandler, and WriterTextHandler.

virtual void TextHandler::handleEndDoc   [inline, virtual]
 

Handle the end of the doc.

Reimplemented in DocFreqIndexer, IndriTextHandler, FlattextDocMgr, and KeyfileDocMgr.

virtual char* TextHandler::handleEndDoc char *    token,
const char *    original,
PropertyList *    list
[inline, virtual]
 

Handle a doc end. The default implementation calls the old handleEndDoc for backward compatibility.

virtual char* TextHandler::handleEndTag char *    tag,
const char *    original,
PropertyList *    list
[inline, virtual]
 

Handle an end tag.

Reimplemented in IndriTextHandler, and ElemDocMgr.

virtual char* TextHandler::handleSymbol char *    sym [inline, virtual]
 

Handle a symbol, possibly transforming it.

Reimplemented in WriterInQueryHandler, StringQuery, and QueryDocument.

virtual char* TextHandler::handleSymbol char *    symbol,
const char *    original,
PropertyList *    list
[inline, virtual]
 

Handle a symbol. The default implementation calls the old handleSymbol for backward compatibility.

virtual char* TextHandler::handleWord char *    word [inline, virtual]
 

Handle a word, possibly transforming it.

Reimplemented in CtfIndexer, DocFreqIndexer, FreqCounter, InvFPTextHandler, KeyfileTextHandler, QueryTextHandler, KeyfileDocMgr, Stemmer, Stopper, WriterInQueryHandler, WriterTextHandler, StringQuery, DocOffsetParser, and QueryDocument.

virtual char* TextHandler::handleWord char *    word,
const char *    original,
PropertyList *    list
[inline, virtual]
 

Handle a word. The default implementation calls the old handleWord for backward compatibility.

Reimplemented in IndriTextHandler, PropIndexTH, and BrillPOSTokenizer.
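
The backward-compatibility default means that a subclass written against the older one-argument `handleWord` still runs when a caller uses the three-argument form. A minimal sketch of that delegation follows. It uses hypothetical stand-in types (`CompatSketch`, `LegacyLowercaser`, and an empty `PropertyListStub`) rather than the toolkit's own classes, and the helper `handleViaNewInterface` exists only to make the behaviour easy to check.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical placeholder for PropertyList; its contents are irrelevant here.
struct PropertyListStub {};

class CompatSketch {
public:
  virtual ~CompatSketch() {}

  // Older interface: receives only the word.
  virtual char *handleWord(char *word) { return word; }

  // Newer interface: the default ignores the extra arguments and forwards
  // to the older overload, so legacy subclasses keep working.
  virtual char *handleWord(char *word, const char *original, PropertyListStub *list) {
    (void)original;
    (void)list;
    return handleWord(word);
  }
};

// A "legacy" subclass that overrides only the one-argument form.
class LegacyLowercaser : public CompatSketch {
public:
  using CompatSketch::handleWord;  // keep the three-argument overload visible

  virtual char *handleWord(char *word) {
    for (char *p = word; *p != '\0'; ++p) {
      *p = static_cast<char>(std::tolower(static_cast<unsigned char>(*p)));
    }
    return word;
  }
};

// Helper: call the three-argument form on a mutable copy of the word.
std::string handleViaNewInterface(CompatSketch &handler, const std::string &word) {
  std::vector<char> copy(word.begin(), word.end());
  copy.push_back('\0');
  char *result = handler.handleWord(&copy[0], word.c_str(), NULL);
  return result != NULL ? std::string(result) : std::string();
}
```

The `using` declaration matters in C++: without it, overriding one overload in the subclass would hide the other overload from callers that hold a `LegacyLowercaser` directly.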

virtual void TextHandler::setTextHandler TextHandler *    th [inline, virtual]
 

Set the TextHandler that this TextHandler will pass information on to.
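
Chaining handlers with `setTextHandler` yields a pipeline in which each stage can transform a word before passing it on. The sketch below assumes that a handler forwards the handled word to the next handler and stops when `handleWord` returns NULL. Neither behaviour is stated on this page, and `ChainSketch`, `StopwordSketch`, `CollectorSketch`, `filterWords`, and the 256-byte buffer are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a chainable handler.
class ChainSketch {
public:
  ChainSketch() : next(NULL) {}
  virtual ~ChainSketch() {}

  void setTextHandler(ChainSketch *th) { next = th; }
  ChainSketch *getTextHandler() { return next; }

  // Copy the word, handle it, and (by assumption) forward any non-NULL result.
  void foundWord(const char *word) {
    if (word == NULL) return;
    std::strncpy(buffer, word, sizeof(buffer) - 1);
    buffer[sizeof(buffer) - 1] = '\0';
    char *handled = handleWord(buffer);
    if (handled != NULL && next != NULL) {
      next->foundWord(handled);
    }
  }

  virtual char *handleWord(char *word) { return word; }

protected:
  ChainSketch *next;  // the next handler in the chain
  char buffer[256];   // assumed size, for illustration only
};

// Drops the word "the" by returning NULL; passes everything else through.
class StopwordSketch : public ChainSketch {
public:
  virtual char *handleWord(char *word) {
    if (std::strcmp(word, "the") == 0) {
      return NULL;
    }
    return word;
  }
};

// Terminal stage that records every word it receives.
class CollectorSketch : public ChainSketch {
public:
  virtual char *handleWord(char *word) {
    words.push_back(word);
    return word;
  }
  std::vector<std::string> words;
};

// Helper: push words through stopper -> collector and return what survives.
std::vector<std::string> filterWords(const std::vector<std::string> &input) {
  StopwordSketch stopper;
  CollectorSketch collector;
  stopper.setTextHandler(&collector);
  for (std::size_t i = 0; i < input.size(); ++i) {
    stopper.foundWord(input[i].c_str());
  }
  return collector.words;
}
```

Feeding the words "the", "lemur", "toolkit" through `filterWords` leaves only "lemur" and "toolkit" in the collector, since the first stage drops "the" before it reaches the second.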


Member Data Documentation

char TextHandler::buffer[MAXWORDSIZE] [protected]
 

string TextHandler::cat [protected]
 

const string TextHandler::category = "TextHandler" [static]
 

Reimplemented in Parser, Stemmer, and Stopper.

string TextHandler::iden [protected]
 

const string TextHandler::identifier = "TextHandler" [static]
 

Reimplemented in ArabicParser, ArabicStemmer, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, KStemmer, Parser, PorterStemmer, ReutersParser, Stemmer, Stopper, TrecParser, and WebParser.

TextHandler* TextHandler::textHandler [protected]
 

The next textHandler in the chain.


The documentation for this class was generated from the following files:
Generated on Wed Nov 3 12:59:57 2004 for Lemur Toolkit by doxygen1.2.18