
IndexEnvironment Class Reference

#include <IndexEnvironment.hpp>

Public Methods

 IndexEnvironment ()
 ~IndexEnvironment ()
void setAnchorTextPath (const std::string &documentRoot, const std::string &anchorTextRoot)
void addFileClass (const std::string &name, const std::string &iterator, const std::string &parser, const std::string &startDocTag, const std::string &endDocTag, const std::string &endMetadataTag, const std::vector< std::string > &include, const std::vector< std::string > &exclude, const std::vector< std::string > &index, const std::vector< std::string > &metadata, const std::map< std::string, std::string > &conflations)
void setIndexedFields (const std::vector< std::string > &fieldNames)
void setNumericField (const std::string &fieldName, bool isNumeric)
void setMetadataIndexedFields (const std::vector< std::string > &fieldNames)
void setStopwords (const std::vector< std::string > &stopwords)
void setStemmer (const std::string &stemmer)
void setMemory (UINT64 memory)
void create (const std::string &repositoryPath, IndexStatus *callback=0)
void open (const std::string &repositoryPath, IndexStatus *callback=0)
void close ()
 close the index and repository

void addFile (const std::string &fileName)
void addFile (const std::string &fileName, const std::string &fileClass)
void addString (const std::string &documentString, const std::string &fileClass, const std::vector< MetadataPair > &metadata)
void addParsedDocument (ParsedDocument *document)


Detailed Description

Principal class for interacting with Indri indexes during index construction. Provides the API for opening or creating an index and its associated repository, setting indexing and text parsing parameters, and adding documents to the repository.
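The description above implies a sequence of calls: set parameters, create the repository, add documents, close. The following is a minimal sketch of that sequence, not a definitive recipe. It assumes C++11 and that parameters are set before create(), which this page does not state. RecordingEnvironment and buildRepository are hypothetical names introduced here; the stand-in only mirrors method names listed on this page so that the sketch compiles without the toolkit.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in that records which methods were called. It mirrors
// names and argument types listed on this page and is NOT the Indri class.
struct RecordingEnvironment {
    std::vector<std::string> calls;
    void setMemory(unsigned long long) { calls.push_back("setMemory"); }
    void setStemmer(const std::string &) { calls.push_back("setStemmer"); }
    void create(const std::string &) { calls.push_back("create"); }
    void addFile(const std::string &, const std::string &) { calls.push_back("addFile"); }
    void close() { calls.push_back("close"); }
};

// Build a repository from a list of files. Env is any type exposing the
// methods used below. Assumption: parameters are set before create().
template <typename Env>
void buildRepository(Env &env,
                     const std::string &repositoryPath,
                     const std::vector<std::string> &files,
                     const std::string &fileClass) {
    env.setMemory(100ULL * 1024 * 1024);  // illustrative budget: 100 MB in bytes
    env.setStemmer("krovetz");            // one of the documented stemmer names
    env.create(repositoryPath);           // create a new index and repository
    for (const std::string &file : files) {
        env.addFile(file, fileClass);     // add each file with an explicit file class
    }
    env.close();                          // close the index and repository
}
```

Because buildRepository is a template, the same function body should also accept a real IndexEnvironment object in place of the stand-in, provided the toolkit headers are available.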


Constructor & Destructor Documentation

IndexEnvironment::IndexEnvironment ()

IndexEnvironment::~IndexEnvironment ()


Member Function Documentation

void IndexEnvironment::addFile (const std::string &fileName, const std::string &fileClass)

add a file of the specified file class to the index and repository

Parameters:
fileName  the file to add
fileClass  the file class of the file (e.g. trecweb).

void IndexEnvironment::addFile (const std::string &fileName)

add a file to the index and repository

Parameters:
fileName  the file to add

void IndexEnvironment::addFileClass (const std::string &name,
    const std::string &iterator,
    const std::string &parser,
    const std::string &startDocTag,
    const std::string &endDocTag,
    const std::string &endMetadataTag,
    const std::vector< std::string > &include,
    const std::vector< std::string > &exclude,
    const std::vector< std::string > &index,
    const std::vector< std::string > &metadata,
    const std::map< std::string, std::string > &conflations)

Add parsing information for a file class. Data for these parameters is passed into the FileClassEnvironmentFactory.

Parameters:
name  name of this file class, e.g. trecweb
iterator  document iterator for this file class
parser  document tokenizer for this file class
startDocTag  tag indicating start of a document
endDocTag  tag indicating the end of a document
endMetadataTag  tag indicating the end of the metadata fields
include  default tags whose contents should be included in the index
exclude  tags whose contents should be excluded from the index
index  tags that should be forwarded to the index for tag extents
metadata  tags whose contents should be indexed as metadata
conflations  tags that should be conflated
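With eleven positional arguments, several of the same type, a call to addFileClass is easy to get wrong. One possible approach, sketched below, is to collect the arguments in a named structure and forward them in declaration order. FileClassSpec, makeExampleSpec and registerFileClass are hypothetical names introduced here. Every value is an illustrative placeholder: this page does not list valid iterator or parser names, and it does not say which way the conflations map points.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical bundle of the addFileClass arguments, in declaration order.
struct FileClassSpec {
    std::string name, iterator, parser;
    std::string startDocTag, endDocTag, endMetadataTag;
    std::vector<std::string> include, exclude, index, metadata;
    std::map<std::string, std::string> conflations;
};

// Fill a spec with placeholder values. None of these strings are taken from
// the toolkit; substitute names that your installation actually supports.
inline FileClassSpec makeExampleSpec() {
    FileClassSpec spec;
    spec.name = "mynews";                     // placeholder file class name
    spec.iterator = "my-iterator";            // placeholder, not a known iterator
    spec.parser = "my-parser";                // placeholder, not a known parser
    spec.startDocTag = "<DOC>";
    spec.endDocTag = "</DOC>";
    spec.endMetadataTag = "</DOCHDR>";
    spec.include.push_back("text");           // tag contents to include
    spec.exclude.push_back("script");         // tag contents to exclude
    spec.index.push_back("title");            // tags forwarded as extents
    spec.metadata.push_back("docno");         // tags kept as metadata
    spec.conflations["headline"] = "title";   // assumed direction: headline is treated as title
    return spec;
}

// Forward the bundle to any Env type that exposes addFileClass with the
// argument order documented above.
template <typename Env>
void registerFileClass(Env &env, const FileClassSpec &s) {
    env.addFileClass(s.name, s.iterator, s.parser,
                     s.startDocTag, s.endDocTag, s.endMetadataTag,
                     s.include, s.exclude, s.index, s.metadata,
                     s.conflations);
}
```

Naming the fields keeps startDocTag, endDocTag and endMetadataTag from being swapped silently, which the compiler cannot catch because all three are strings.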

void IndexEnvironment::addParsedDocument (ParsedDocument *document)

add an already parsed document to the index and repository

Parameters:
document  the document to add

void IndexEnvironment::addString (const std::string &documentString, const std::string &fileClass, const std::vector< MetadataPair > &metadata)

add a string to the index and repository

Parameters:
documentString  the document to add
fileClass  the file class of the string (e.g. trecweb).
metadata  the metadata pairs associated with the string.

void IndexEnvironment::close ()

close the index and repository

void IndexEnvironment::create (const std::string &repositoryPath, IndexStatus *callback = 0)

create a new index and repository

Parameters:
repositoryPath  the path to the repository
callback  IndexStatus object to be notified of indexing progress.

void IndexEnvironment::open (const std::string &repositoryPath, IndexStatus *callback = 0)

open an existing index and repository

Parameters:
repositoryPath  the path to the repository
callback  IndexStatus object to be notified of indexing progress.

void IndexEnvironment::setAnchorTextPath (const std::string &documentRoot, const std::string &anchorTextRoot)

Set document root path and anchor text root path.

Parameters:
documentRoot  path to document root.
anchorTextRoot  path to anchor text root.

void IndexEnvironment::setIndexedFields (const std::vector< std::string > &fieldNames)

set names of fields to be indexed as data

Parameters:
fieldNames  the list of fields.

void IndexEnvironment::setMemory (UINT64 memory)

set the amount of memory to use for internal structures

Parameters:
memory  the number of bytes to use.
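Since the argument is a byte count in a 64-bit type, it helps to do the unit conversion in 64-bit arithmetic as well. The helper below is a small sketch; megabytesToBytes is a hypothetical name, and it assumes that std::uint64_t converts cleanly to the toolkit's UINT64 typedef.

```cpp
#include <cstdint>

// Convert megabytes to bytes using 64-bit arithmetic throughout, so that
// budgets of 4 GB or more do not overflow a 32-bit intermediate value.
// Assumption: std::uint64_t is compatible with the UINT64 parameter type.
inline std::uint64_t megabytesToBytes(std::uint64_t megabytes) {
    return megabytes * 1024u * 1024u;
}
```

A call might then read env.setMemory(megabytesToBytes(512)), where env is an IndexEnvironment object.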

void IndexEnvironment::setMetadataIndexedFields (const std::vector< std::string > &fieldNames)

set names of fields to be indexed as metadata

Parameters:
fieldNames  the list of fields.

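void IndexEnvironment::setNumericField (const std::string &fieldName, bool isNumeric)

set whether the named field is treated as numeric

Parameters:
fieldName  the name of the field.
isNumeric  true if the field is numeric, false otherwise.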

void IndexEnvironment::setStemmer (const std::string &stemmer)

set the stemmer to use

Parameters:
stemmer  the stemmer to use. One of krovetz, porter
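Because the stemmer is selected by name, a caller may want to check user-supplied input against the two documented values before passing it on. The following is a sketch only: isDocumentedStemmer is a hypothetical helper, and the exact lowercase comparison is an assumption, since this page does not say whether names are case sensitive or what happens with an unknown name.

```cpp
#include <string>

// Returns true when the name is one of the two stemmers listed above.
// Assumption: names are compared exactly, in lowercase.
inline bool isDocumentedStemmer(const std::string &name) {
    return name == "krovetz" || name == "porter";
}
```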

void IndexEnvironment::setStopwords (const std::vector< std::string > &stopwords)

set the list of stopwords

Parameters:
stopwords  the list of stopwords
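The stopword list is passed as a vector of strings, so a list kept in a file has to be loaded first. Below is a minimal sketch of one way to do that; readStopwords is a hypothetical helper, and the whitespace-separated format is an assumption, since this page does not prescribe a file format.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read whitespace-separated words from a stream into a vector.
// Assumption: the list holds one token per word, separated by spaces or
// newlines; no comment syntax or case folding is applied.
inline std::vector<std::string> readStopwords(std::istream &in) {
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}
```

The function accepts any input stream, for example a std::ifstream opened on a stopword file or a std::istringstream holding an inline list, and its result can be passed directly to setStopwords.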


The documentation for this class was generated from the following files:
Generated on Wed Nov 3 12:59:39 2004 for Lemur Toolkit by doxygen 1.2.18