
KeyfileDocMgr Class Reference

#include <KeyfileDocMgr.hpp>

Inherits DocumentManager and TextHandler. Inherited by ElemDocMgr.

Public Methods

 KeyfileDocMgr ()
 default constructor

 KeyfileDocMgr (const string &name)
 KeyfileDocMgr (string name, string mode, string source)
virtual ~KeyfileDocMgr ()
char * getDoc (const string &docID) const
 return the document associated with this ID

virtual char * handleDoc (char *docno)
 add entry for new doc

virtual void handleEndDoc ()
 finish entry for current doc

virtual char * handleWord (char *word)
 Add start and end byte offsets for this term to the list of offsets.

virtual void setParser (Parser *p)
 set myparser to p

virtual Parser * getParser () const
 returns a handle to a Parser object that can handle parsing the raw format of these documents

virtual void buildMgr ()
virtual const string & getMyID () const
 return name of this document manager, with the file extension (.bdm).

vector< Match > getOffsets (const string &docID) const
virtual bool open (const string &manname)
 Open and load the toc file manname.


Protected Methods

virtual void writeTOC ()
virtual bool loadTOC ()
bool loadFTFiles (const string &fn, int num)

Protected Attributes

Parser * myparser
vector< Match > offsets
int numdocs
string pm
Keyfile poslookup
Keyfile doclookup
int dbcache
btl docEntry
char * myDoc
int doclen
string IDname
string IDnameext
vector< string > sources
int numOldSources
 how many sources already processed?

int fileid
bool ignoreDoc
 are we ignoring this document?


Detailed Description

Document manager that uses Keyfile for data storage. In addition to providing access to the raw document text, it stores byte offsets (start and end byte) for each token within the document. This is useful for passage windows or for highlighting query term matches. Implements the TextHandler interface for building the manager.
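
To illustrate why per-token byte offsets help with highlighting, here is a minimal sketch. It does not use the toolkit itself: TokenSpan is a hypothetical stand-in for a Match entry, and both the field names and the convention that end is one past the last byte are assumptions. highlightTokens is likewise an illustrative helper, not part of KeyfileDocMgr.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one Match entry: the byte range of a token in
// the raw document text. Here `end` is one past the last byte; the real
// type's field names and end convention may differ.
struct TokenSpan {
  std::size_t start;
  std::size_t end;
};

// Wrap the tokens at the given positions in [[ ]] markers.
// `spans` is indexed by token position; `hits` must be in ascending order.
std::string highlightTokens(const std::string &doc,
                            const std::vector<TokenSpan> &spans,
                            const std::vector<std::size_t> &hits) {
  assert(std::is_sorted(hits.begin(), hits.end()));
  std::string out;
  std::size_t cursor = 0;  // first byte of doc not yet copied to out
  for (std::size_t pos : hits) {
    if (pos >= spans.size()) continue;  // unknown token position
    const TokenSpan &s = spans[pos];
    // Skip spans that are malformed or overlap text already emitted.
    if (s.start < cursor || s.start > s.end || s.end > doc.size()) continue;
    out.append(doc, cursor, s.start - cursor);  // text before the token
    out += "[[";
    out.append(doc, s.start, s.end - s.start);  // the token itself
    out += "]]";
    cursor = s.end;
  }
  out.append(doc, cursor, std::string::npos);  // remaining tail
  return out;
}
```

Because the offsets point into the raw text, the surrounding punctuation and spacing are reproduced exactly; only the matched tokens are decorated.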


Constructor & Destructor Documentation

KeyfileDocMgr::KeyfileDocMgr () [inline]
 

default constructor

KeyfileDocMgr::KeyfileDocMgr (const string & name)
 

Constructor (for open).
 name = toc file for this manager (same as getMyID)

KeyfileDocMgr::KeyfileDocMgr (string name, string mode, string source)
 

Constructor (for build).
 name = what to name this manager
 mode = type of parser to use
 source = file with list of files this will manage
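
The source argument names a file that lists the files to manage. As a rough sketch of how such a list might be read, the helper below assumes one path per line and ignores blank lines; that layout is an assumption, not something this page specifies. readSourceList is a hypothetical helper and the paths in the comment are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>  // lets callers feed the function from a std::istringstream
#include <string>
#include <vector>

// Read a list of source files, assuming one path per line.
// Trailing spaces, tabs and carriage returns are trimmed; blank lines
// are skipped. Example input: "a.txt\ndata/b.txt\n" (hypothetical paths).
std::vector<std::string> readSourceList(std::istream &in) {
  std::vector<std::string> sources;
  std::string line;
  while (std::getline(in, line)) {
    const std::size_t last = line.find_last_not_of(" \t\r");
    if (last == std::string::npos) continue;  // blank or whitespace-only line
    line.erase(last + 1);                     // drop trailing whitespace
    assert(!line.empty());
    sources.push_back(line);
  }
  return sources;
}
```

Taking a std::istream rather than a file name keeps the sketch easy to exercise with an in-memory string.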

KeyfileDocMgr::~KeyfileDocMgr () [virtual]
 


Member Function Documentation

void KeyfileDocMgr::buildMgr () [virtual]
 

Build the document manager tables from the files previously provided in the constructor.

Implements DocumentManager.

char * KeyfileDocMgr::getDoc (const string & docID) const [virtual]
 

return the document associated with this ID

Implements DocumentManager.

virtual const string& KeyfileDocMgr::getMyID () const [inline, virtual]
 

return name of this document manager, with the file extension (.bdm).

Implements DocumentManager.

vector< Match > KeyfileDocMgr::getOffsets (const string & docID) const
 

Get the array of Match entries for the tokens in the document named docID. The entries are indexed by token position (as recorded in a TermInfoList object).
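
Since the entries are indexed by token position, a passage window given as a range of token positions maps directly to a byte range in the raw text. The sketch below shows that mapping under stated assumptions: TokenSpan is a hypothetical stand-in for Match (field names and the exclusive end offset are assumptions), positions are taken to be zero-based, and passageText is an illustrative helper rather than a toolkit function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one Match entry; `end` is one past the last
// byte of the token. The real type may use different names or conventions.
struct TokenSpan {
  std::size_t start;
  std::size_t end;
};

// Return the raw text covering token positions first..last (inclusive).
// `spans` is indexed by token position, as described for getOffsets.
std::string passageText(const std::string &doc,
                        const std::vector<TokenSpan> &spans,
                        std::size_t first, std::size_t last) {
  assert(first <= last);
  if (spans.empty() || first >= spans.size()) return std::string();
  if (last >= spans.size()) last = spans.size() - 1;  // clamp the window
  const std::size_t begin = spans[first].start;
  const std::size_t end = spans[last].end;
  if (begin > end || end > doc.size()) return std::string();  // bad offsets
  return doc.substr(begin, end - begin);
}
```

With the real class, doc would come from getDoc and spans from getOffsets for the same docID.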

virtual Parser* KeyfileDocMgr::getParser () const [inline, virtual]
 

returns a handle to a Parser object that can handle parsing the raw format of these documents

Implements DocumentManager.

char * KeyfileDocMgr::handleDoc (char * docno) [virtual]
 

add entry for new doc

Reimplemented from TextHandler.

void KeyfileDocMgr::handleEndDoc () [virtual]
 

finish entry for current doc

Reimplemented from TextHandler.

virtual char* KeyfileDocMgr::handleWord (char * word) [inline, virtual]
 

Add start and end byte offsets for this term to the list of offsets.

Reimplemented from TextHandler.
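
The idea of recording a start and end byte offset per term can be sketched without the toolkit. In the example below a plain whitespace split stands in for the Parser, which is an assumption made only for illustration; the real class presumably takes its positions from the parser it was given. TokenSpan and recordOffsets are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one Match entry; `end` is one past the last
// byte of the token.
struct TokenSpan {
  std::size_t start;
  std::size_t end;
};

// Split `doc` on whitespace and record the byte range of every token,
// in order, so that the vector index equals the token position.
std::vector<TokenSpan> recordOffsets(const std::string &doc) {
  std::vector<TokenSpan> offsets;
  std::size_t i = 0;
  while (i < doc.size()) {
    // Skip whitespace between tokens.
    while (i < doc.size() &&
           std::isspace(static_cast<unsigned char>(doc[i]))) {
      ++i;
    }
    if (i == doc.size()) break;
    const std::size_t start = i;
    // Advance to the end of the current token.
    while (i < doc.size() &&
           !std::isspace(static_cast<unsigned char>(doc[i]))) {
      ++i;
    }
    assert(start < i);  // every recorded token has at least one byte
    offsets.push_back(TokenSpan{start, i});
  }
  return offsets;
}
```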

bool KeyfileDocMgr::loadFTFiles (const string & fn, int num) [protected]
 

bool KeyfileDocMgr::loadTOC () [protected, virtual]
 

Reimplemented in ElemDocMgr.

virtual bool KeyfileDocMgr::open (const string & manname) [inline, virtual]
 

Open and load the toc file manname.

Implements DocumentManager.

Reimplemented in ElemDocMgr.

virtual void KeyfileDocMgr::setParser (Parser * p) [inline, virtual]
 

set myparser to p

void KeyfileDocMgr::writeTOC () [protected, virtual]
 

Reimplemented in ElemDocMgr.


Member Data Documentation

int KeyfileDocMgr::dbcache [protected]
 

btl KeyfileDocMgr::docEntry [protected]
 

int KeyfileDocMgr::doclen [protected]
 

Keyfile KeyfileDocMgr::doclookup [protected]
 

int KeyfileDocMgr::fileid [protected]
 

string KeyfileDocMgr::IDname [protected]
 

string KeyfileDocMgr::IDnameext [protected]
 

bool KeyfileDocMgr::ignoreDoc [protected]
 

are we ignoring this document?

char* KeyfileDocMgr::myDoc [protected]
 

Parser* KeyfileDocMgr::myparser [protected]
 

int KeyfileDocMgr::numdocs [protected]
 

int KeyfileDocMgr::numOldSources [protected]
 

how many sources already processed?

vector<Match> KeyfileDocMgr::offsets [protected]
 

string KeyfileDocMgr::pm [protected]
 

Keyfile KeyfileDocMgr::poslookup [protected]
 

vector<string> KeyfileDocMgr::sources [protected]
 


The documentation for this class was generated from the following files:
Generated on Wed Nov 3 12:59:43 2004 for Lemur Toolkit by doxygen 1.2.18