
InvPushIndex Class Reference

#include <InvPushIndex.hpp>

Inheritance diagram for InvPushIndex:

Base class: PushIndex. Derived classes: InvFPPushIndex, IncFPPushIndex, InvPassagePushIndex, IncPassagePushIndex.

Public Methods

 InvPushIndex ()
 InvPushIndex (const string &prefix, int cachesize=128000000, long maxfilesize=2100000000, DOCID_T startdocid=1)
 ~InvPushIndex ()
void setName (const string &prefix)
 Sets the name for this index. The name is used as the prefix for all files related to this index.

bool beginDoc (const DocumentProps *dp)
 Signals the beginning of a new document. Returns true if initiation was successful.

bool addTerm (const Term &t)
 Adds a term to the current document. Returns true if the term was added successfully.

void endDoc (const DocumentProps *dp)
 Signifies the end of the current document.

virtual void endDoc (const DocumentProps *dp, const string &mgr)
 Signifies the end of the current document and associates it with the given document manager. This does not change the manager that was previously set.

void endCollection (const CollectionProps *cp)
 Signifies the end of this collection. Properties passed at the beginning of a collection should be handled by the constructor.

void setDocManager (const string &mgrID)
 Sets the document manager to use for succeeding documents.
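
The public methods above imply a call order: beginDoc, then addTerm once per term, then endDoc, with endCollection after the last document. The following is a minimal sketch of that order only. SketchPushIndex, SketchTerm, SketchDocProps, and indexAll are hypothetical stand-ins invented for this illustration, since the real Term, DocumentProps, and CollectionProps types are declared in headers this page does not show. The stand-in keeps everything in memory and writes no files.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the Lemur types (assumption, not the real API).
struct SketchTerm { std::string spelling; };
struct SketchDocProps { std::string externalId; };

// Minimal in-memory class that mirrors the call order of the public methods.
class SketchPushIndex {
 public:
  bool beginDoc(const SketchDocProps *dp) {
    if (dp == nullptr || open_) return false;  // refuse null or nested documents
    open_ = true;
    current_.clear();
    return true;
  }
  bool addTerm(const SketchTerm &t) {
    if (!open_ || t.spelling.empty()) return false;  // needs an open document
    current_.push_back(t.spelling);
    return true;
  }
  void endDoc(const SketchDocProps *dp) {
    assert(open_ && dp != nullptr);
    docIds_.push_back(dp->externalId);
    totalTerms_ += current_.size();
    open_ = false;
  }
  void endCollection() { finished_ = true; }

  std::size_t docCount() const { return docIds_.size(); }
  std::size_t totalTerms() const { return totalTerms_; }
  bool finished() const { return finished_; }

 private:
  bool open_ = false;
  bool finished_ = false;
  std::size_t totalTerms_ = 0;
  std::vector<std::string> current_;
  std::vector<std::string> docIds_;
};

// Pushes every document through beginDoc / addTerm / endDoc, then closes the
// collection. Returns the number of terms that were accepted.
std::size_t indexAll(SketchPushIndex &index,
                     const std::vector<std::vector<std::string>> &docs) {
  std::size_t added = 0;
  for (std::size_t i = 0; i < docs.size(); ++i) {
    SketchDocProps props{"doc" + std::to_string(i + 1)};
    if (!index.beginDoc(&props)) continue;
    for (const std::string &word : docs[i]) {
      if (index.addTerm(SketchTerm{word})) ++added;
    }
    index.endDoc(&props);
  }
  index.endCollection();
  return added;
}
```

Checking the boolean results of beginDoc and addTerm, as the sketch does, follows from the documented return values. What the real class does on failure is not stated on this page.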


Protected Methods

void writeTOC (int numinv)
void writeDocIDs ()
void writeCache ()
void lastWriteCache ()
void writeDTIDs ()
void writeDocMgrIDs ()
int docMgrID (const string &mgr)
virtual void doendDoc (const DocumentProps *dp, int mgrid)

Protected Attributes

long maxfile
 the maximum size a file may reach

MemCache * cache
 the main memory handler for building

vector< EXDOCID_T > docIDs
 list of external docids in internal docid order

vector< TERM_T > termIDs
 list of terms in termid order

vector< string > tempfiles
 list of tempfiles written to flush the cache

vector< string > dtfiles
 list of dt index files

vector< string > docmgrs
FILE * writetlookup
 filestream for writing the lookup table to the docterm db

ofstream writetlist
 filestream for writing the list of located terms for each document

COUNT_T tcount
 count of total terms

COUNT_T tidcount
 count of unique terms

COUNT_T dtidcount
 count of unique terms in the current doc

string name
 the prefix name

TABLE_T wordtable
 table of all terms and their doclists

map< TERMID_T, COUNT_T > termlist
 map of terms and freqs

int * membuf
 memory to use for cache and buffers

int membufsize
int curdocmgr
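
The attribute descriptions mention three counters (total terms, unique terms, and unique terms in the current document) and a map of terms to frequencies. As a rough illustration of how such bookkeeping can fit together, here is a small sketch. SketchCounts, countTerm, and finishDoc are hypothetical names, term ids starting at 1 is an assumption, and none of this is taken from the InvPushIndex implementation.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical bookkeeping structure (assumption, not the Lemur layout).
struct SketchCounts {
  long total = 0;                      // total term occurrences seen so far
  std::map<std::string, int> termIds;  // spelling -> term id; size() = unique terms
  std::map<int, int> currentDoc;       // term id -> frequency in the open document
};

// Records one occurrence of 'term' in the currently open document.
void countTerm(SketchCounts &c, const std::string &term) {
  auto it = c.termIds.find(term);
  if (it == c.termIds.end()) {
    int id = static_cast<int>(c.termIds.size()) + 1;  // assumed: ids start at 1
    it = c.termIds.emplace(term, id).first;
  }
  ++c.currentDoc[it->second];
  ++c.total;
}

// Closes the document: the per-document map is reset, the global counts stay.
void finishDoc(SketchCounts &c) { c.currentDoc.clear(); }

// Usage: a repeated term raises the total but not the unique counts.
inline void demoCounts() {
  SketchCounts c;
  countTerm(c, "index");
  countTerm(c, "index");
  assert(c.total == 2 && c.termIds.size() == 1 && c.currentDoc.size() == 1);
}
```

The number of unique terms in the current document is simply the size of the per-document map, which is why the sketch needs no separate counter for it.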

Constructor & Destructor Documentation

InvPushIndex::InvPushIndex ( ) [inline]

InvPushIndex::InvPushIndex ( const string & prefix,
    int cachesize = 128000000,
    long maxfilesize = 2100000000,
    DOCID_T startdocid = 1 )

InvPushIndex::~InvPushIndex ( )
 


Member Function Documentation

bool InvPushIndex::addTerm ( const Term & t ) [virtual]

Adds a term to the current document. Returns true if the term was added successfully.

Implements PushIndex.

Reimplemented in IncPassagePushIndex, InvFPPushIndex, and InvPassagePushIndex.

bool InvPushIndex::beginDoc ( const DocumentProps * dp ) [virtual]

Signals the beginning of a new document. Returns true if initiation was successful.

Implements PushIndex.

Reimplemented in IncPassagePushIndex and InvPassagePushIndex.

int InvPushIndex::docMgrID ( const string & mgr ) [protected]

Returns the internal id of the given docmgr. If mgr is not already registered, it will be added.
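
The lookup-or-register behaviour described for docMgrID can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: lookupOrRegister is a hypothetical free function, its managers argument plays the role of the docmgrs vector, and using the position in that vector as the internal id is an assumption this page does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the id of 'mgr', appending it to 'managers' first if it is unknown.
// Assumption: the id is the zero-based position in the vector.
int lookupOrRegister(std::vector<std::string> &managers, const std::string &mgr) {
  for (std::size_t i = 0; i < managers.size(); ++i) {
    if (managers[i] == mgr) return static_cast<int>(i);  // already registered
  }
  managers.push_back(mgr);  // first sighting: register it
  return static_cast<int>(managers.size() - 1);
}

// Usage: asking for the same manager twice yields the same id and one entry.
inline void demoLookup() {
  std::vector<std::string> managers;
  int first = lookupOrRegister(managers, "mgrA");
  int again = lookupOrRegister(managers, "mgrA");
  assert(first == again && managers.size() == 1);
}
```

A linear scan is reasonable when only a handful of document managers exist. A map from name to id would be the usual alternative for larger sets.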

void InvPushIndex::doendDoc ( const DocumentProps * dp, int mgrid ) [protected, virtual]

Reimplemented in IncPassagePushIndex, InvFPPushIndex, and InvPassagePushIndex.

void InvPushIndex::endCollection ( const CollectionProps * cp ) [virtual]

Signifies the end of this collection. Properties passed at the beginning of a collection should be handled by the constructor.

Implements PushIndex.

Reimplemented in InvFPPushIndex.

void InvPushIndex::endDoc ( const DocumentProps * dp, const string & mgr ) [virtual]

Signifies the end of the current document and associates it with the given document manager. This does not change the manager that was previously set.
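
To make the difference between the two endDoc overloads concrete, the sketch below records which manager id each document ends up with. SketchDocManagers is a hypothetical stand-in written for this illustration. The -1 sentinel for "no manager set" and the position-based ids are assumptions, and the real class also takes a DocumentProps pointer, which is omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in (assumption, not the Lemur class).
class SketchDocManagers {
 public:
  // Counterpart of setDocManager: changes the manager for succeeding documents.
  void setDocManager(const std::string &mgrID) { current_ = idFor(mgrID); }

  // Counterpart of endDoc(dp): the document is recorded under the current manager.
  int endDoc() {
    perDoc_.push_back(current_);
    return current_;
  }

  // Counterpart of endDoc(dp, mgr): one-off association, current_ is untouched.
  int endDoc(const std::string &mgr) {
    int id = idFor(mgr);
    perDoc_.push_back(id);
    return id;
  }

  const std::vector<int> &perDocument() const { return perDoc_; }

 private:
  // Looks the name up and registers it on first use.
  int idFor(const std::string &mgr) {
    for (std::size_t i = 0; i < names_.size(); ++i) {
      if (names_[i] == mgr) return static_cast<int>(i);
    }
    names_.push_back(mgr);
    return static_cast<int>(names_.size() - 1);
  }

  std::vector<std::string> names_;
  std::vector<int> perDoc_;
  int current_ = -1;  // assumed sentinel: no manager set yet
};

// Usage: the override applies to a single document only.
inline void demoOverride() {
  SketchDocManagers m;
  m.setDocManager("A");
  int first = m.endDoc();      // recorded under A
  int second = m.endDoc("B");  // recorded under B, A stays current
  int third = m.endDoc();      // recorded under A again
  assert(first == third && first != second);
}
```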

void InvPushIndex::endDoc ( const DocumentProps * dp ) [virtual]

Signifies the end of the current document.

Implements PushIndex.

void InvPushIndex::lastWriteCache ( ) [protected]

void InvPushIndex::setDocManager ( const string & mgrID ) [virtual]

Sets the document manager to use for succeeding documents.

Implements PushIndex.

void InvPushIndex::setName ( const string & prefix )

Sets the name for this index. The name is used as the prefix for all files related to this index.

void InvPushIndex::writeCache ( ) [protected]

void InvPushIndex::writeDocIDs ( ) [protected]

void InvPushIndex::writeDocMgrIDs ( ) [protected]

void InvPushIndex::writeDTIDs ( ) [protected]

void InvPushIndex::writeTOC ( int numinv ) [protected]

Reimplemented in InvFPPushIndex.


Member Data Documentation

MemCache* InvPushIndex::cache [protected]

the main memory handler for building

int InvPushIndex::curdocmgr [protected]

vector<EXDOCID_T> InvPushIndex::docIDs [protected]

list of external docids in internal docid order

vector<string> InvPushIndex::docmgrs [protected]

vector<string> InvPushIndex::dtfiles [protected]

list of dt index files

COUNT_T InvPushIndex::dtidcount [protected]

count of unique terms in the current doc

long InvPushIndex::maxfile [protected]

the maximum size a file may reach

int* InvPushIndex::membuf [protected]

memory to use for cache and buffers

int InvPushIndex::membufsize [protected]

string InvPushIndex::name [protected]

the prefix name

COUNT_T InvPushIndex::tcount [protected]

count of total terms

vector<string> InvPushIndex::tempfiles [protected]

list of tempfiles written to flush the cache

vector<TERM_T> InvPushIndex::termIDs [protected]

list of terms in termid order

map<TERMID_T, COUNT_T> InvPushIndex::termlist [protected]

map of terms and freqs

Reimplemented in InvFPPushIndex.

COUNT_T InvPushIndex::tidcount [protected]

count of unique terms

TABLE_T InvPushIndex::wordtable [protected]

table of all terms and their doclists

ofstream InvPushIndex::writetlist [protected]

filestream for writing the list of located terms for each document

FILE* InvPushIndex::writetlookup [protected]

filestream for writing the lookup table to the docterm db


The documentation for this class was generated from the following files:
Generated on Wed Nov 3 12:59:41 2004 for Lemur Toolkit by doxygen 1.2.18