Colibri Core
PatternModelOptions Class Reference

Options for Pattern Model loading and training.

#include <patternmodel.h>

Public Member Functions

 PatternModelOptions ()
 
 PatternModelOptions (const PatternModelOptions &ref)
 

Public Attributes

int MINTOKENS
 
int MINTOKENS_SKIPGRAMS
 
int MINTOKENS_UNIGRAMS
 
int MINLENGTH
 The minimum length of patterns to be loaded/extracted (in words/tokens) (default: 1)
 
int MAXLENGTH
 The maximum length of patterns to be loaded/extracted, inclusive (in words/tokens) (default: 100)
 
int MAXBACKOFFLENGTH
 
bool DOSKIPGRAMS
 Load/extract skipgrams? (default: false)
 
bool DOSKIPGRAMS_EXHAUSTIVE
 Load/extract skipgrams in an exhaustive fashion? More memory intensive, but the only option for unindexed models (default: false)
 
int MINSKIPTYPES
 Minimum required number of distinct patterns that can fit in a gap of a skipgram for the skipgram to be included (default: 2)
 
int MAXSKIPS
 Maximum skips per skipgram.
 
bool DOREVERSEINDEX
 Obsolete now, only here for backward-compatibility with v1.
 
bool DOPATTERNPERLINE
 Assume each line contains one integral pattern, rather than actively extracting all subpatterns on a line (default: false)
 
int PRUNENONSUBSUMED
 
bool DOREMOVEINDEX
 Do not load index information (for indexed models); load just the patterns without any counts.
 
bool DOREMOVENGRAMS
 Remove n-grams from the model upon loading it.
 
bool DOREMOVESKIPGRAMS
 Remove skip-grams from the model upon loading it.
 
bool DOREMOVEFLEXGRAMS
 Remove flexgrams from the model upon loading it.
 
bool DORESET
 Sets all counts to zero upon loading and clears indices.
 
bool QUIET
 Don't output to stderr.
 
bool DEBUG
 Output extra debug information.
 

Detailed Description

Options for Pattern Model loading and training.

This class defines the parameters that can be set for loading and training Pattern Models. It is passed to various constructors and methods.

Constructor & Destructor Documentation

PatternModelOptions::PatternModelOptions ( )
inline

Initialise with default values. All members are public and can be set explicitly.
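
A minimal sketch of the usual pattern: default-construct the options, then overwrite only the members that differ from the defaults. So that the snippet compiles on its own, it uses a stand-in struct, PatternModelOptionsSketch, that mirrors a few documented members and their documented defaults; the struct and the makeOptions helper are hypothetical, and the default of QUIET is an assumption not stated in this reference.

```cpp
#include <cassert>

// Stand-in for illustration only. Real code would include <patternmodel.h>
// and use PatternModelOptions directly.
struct PatternModelOptionsSketch {
    int MINTOKENS = 2;         // documented default when building a model
    int MINLENGTH = 1;         // documented default
    int MAXLENGTH = 100;       // documented default
    bool DOSKIPGRAMS = false;  // documented default
    int MINSKIPTYPES = 2;      // documented default
    bool QUIET = false;        // assumed default, not stated in this reference
};

// Hypothetical helper: options for extracting patterns of up to `maxLength`
// tokens that occur at least `threshold` times, without stderr output.
PatternModelOptionsSketch makeOptions(int threshold, int maxLength) {
    PatternModelOptionsSketch options;  // start from the defaults
    options.MINTOKENS = threshold;      // members are public: assign directly
    options.MAXLENGTH = maxLength;
    options.QUIET = true;
    return options;
}
```

With the real class, the same member assignments would be applied to a PatternModelOptions instance before passing it to a model constructor or training method.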

PatternModelOptions::PatternModelOptions ( const PatternModelOptions & ref)
inline

Copy constructor

Member Data Documentation

bool PatternModelOptions::DEBUG

Output extra debug information.

bool PatternModelOptions::DOPATTERNPERLINE

Assume each line contains one integral pattern, rather than actively extracting all subpatterns on a line (default: false)

bool PatternModelOptions::DOREMOVEFLEXGRAMS

Remove flexgrams from the model upon loading it.

bool PatternModelOptions::DOREMOVEINDEX

Do not load index information (for indexed models); load just the patterns without any counts.

bool PatternModelOptions::DOREMOVENGRAMS

Remove n-grams from the model upon loading it.

bool PatternModelOptions::DOREMOVESKIPGRAMS

Remove skip-grams from the model upon loading it.

bool PatternModelOptions::DORESET

Sets all counts to zero upon loading and clears indices.

bool PatternModelOptions::DOREVERSEINDEX

Obsolete now, only here for backward-compatibility with v1.

bool PatternModelOptions::DOSKIPGRAMS

Load/extract skipgrams? (default: false)

bool PatternModelOptions::DOSKIPGRAMS_EXHAUSTIVE

Load/extract skipgrams in an exhaustive fashion? More memory intensive, but the only option for unindexed models (default: false)

int PatternModelOptions::MAXBACKOFFLENGTH

Counting n-grams is done iteratively for each increasing n. For each n, the presence of sub-n-grams of length n-1 is checked. This value defines a maximum length for this back-off check. In combination with MINLENGTH, this allows earlier pruning and conserves memory. (default: MAXLENGTH)
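
The iterative counting with a back-off check can be sketched as follows. This is a simplified illustration, not Colibri Core's implementation: countNgrams, Tokens and Counts are hypothetical names, MINLENGTH is ignored, and the sketch assumes the check is applied whenever n-1 does not exceed the back-off limit.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;
using Counts = std::map<Tokens, int>;

// Count n-grams for n = 1..maxLength, keeping those with at least minTokens
// occurrences. For n > 1 (and n-1 <= maxBackoffLength, an assumed reading of
// the option) an n-gram is only counted if both of its (n-1)-gram sub-parts
// survived the previous pass, so hopeless candidates are never stored.
Counts countNgrams(const Tokens& corpus, int maxLength, int minTokens,
                   int maxBackoffLength) {
    Counts kept;
    for (int n = 1; n <= maxLength; ++n) {
        const std::size_t len = static_cast<std::size_t>(n);
        const bool backoff = n > 1 && (n - 1) <= maxBackoffLength;
        Counts level;  // candidate counts for this n only
        for (std::size_t i = 0; i + len <= corpus.size(); ++i) {
            const Tokens gram(corpus.begin() + i, corpus.begin() + i + len);
            if (backoff) {
                const Tokens left(gram.begin(), gram.end() - 1);
                const Tokens right(gram.begin() + 1, gram.end());
                if (kept.count(left) == 0 || kept.count(right) == 0) continue;
            }
            ++level[gram];
        }
        // Prune this level against the occurrence threshold.
        for (const auto& entry : level) {
            if (entry.second >= minTokens) kept.insert(entry);
        }
    }
    return kept;
}
```

In this sketch the back-off check never changes the final result, because an n-gram that meets the threshold always has sub-n-grams that meet it too; the check only reduces how many candidates are held in memory during each pass.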

int PatternModelOptions::MAXLENGTH

The maximum length of patterns to be loaded/extracted, inclusive (in words/tokens) (default: 100)

int PatternModelOptions::MAXSKIPS

Maximum skips per skipgram.

int PatternModelOptions::MINLENGTH

The minimum length of patterns to be loaded/extracted (in words/tokens) (default: 1)

int PatternModelOptions::MINSKIPTYPES

Minimum required number of distinct patterns that can fit in a gap of a skipgram for the skipgram to be included (default: 2)
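
To illustrate what counting "distinct patterns that fit in a gap" means, here is a small sketch for the simplest case: a skipgram with one single-token gap between two fixed tokens. The function names are hypothetical and the code is not taken from Colibri Core, which handles gaps of arbitrary size.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

// Number of distinct tokens observed between `left` and `right` in the corpus,
// i.e. the distinct fillers of the skipgram "left {*} right".
std::size_t distinctGapFillers(const Tokens& corpus, const std::string& left,
                               const std::string& right) {
    std::set<std::string> fillers;
    for (std::size_t i = 0; i + 2 < corpus.size(); ++i) {
        if (corpus[i] == left && corpus[i + 2] == right) {
            fillers.insert(corpus[i + 1]);
        }
    }
    return fillers.size();
}

// The skipgram is kept only if enough distinct fillers were seen.
bool keepSkipgram(const Tokens& corpus, const std::string& left,
                  const std::string& right, int minSkipTypes) {
    return distinctGapFillers(corpus, left, right) >=
           static_cast<std::size_t>(minSkipTypes);
}
```

With the default of 2, a skipgram whose gap is always filled by the same token is discarded, since the corresponding n-gram already captures it.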

int PatternModelOptions::MINTOKENS

The occurrence threshold: the minimum number of occurrences for a pattern to be included in a model. Defaults to 2 for building and to 1 for loading.

int PatternModelOptions::MINTOKENS_SKIPGRAMS

The occurrence threshold for skipgrams: the minimum number of occurrences for a skipgram to be included in a model. Defaults to the same value as MINTOKENS. Only used if DOSKIPGRAMS or DOSKIPGRAMS_EXHAUSTIVE is set to true.

int PatternModelOptions::MINTOKENS_UNIGRAMS

The occurrence threshold for unigrams: unigrams must occur at least this many times for higher-order n-grams/skipgrams containing them to be included in a model. Defaults to the same value as MINTOKENS. Only has an effect if MINTOKENS_UNIGRAMS > MINTOKENS.
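
The unigram threshold can be read as a filter on candidate n-grams, sketched below. The helper unigramsFrequentEnough is hypothetical and only illustrates the documented rule; it is not Colibri Core's code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// True when every token of `ngram` occurs at least minTokensUnigrams times
// according to `unigramCounts`. Tokens missing from the map count as zero.
bool unigramsFrequentEnough(const std::vector<std::string>& ngram,
                            const std::map<std::string, int>& unigramCounts,
                            int minTokensUnigrams) {
    for (const std::string& token : ngram) {
        const auto it = unigramCounts.find(token);
        const int count = (it == unigramCounts.end()) ? 0 : it->second;
        if (count < minTokensUnigrams) return false;
    }
    return true;
}
```

A higher-order pattern would then be considered only if this check passes, in addition to meeting MINTOKENS itself.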

int PatternModelOptions::PRUNENONSUBSUMED

bool PatternModelOptions::QUIET

Don't output to stderr.


The documentation for this class was generated from the following file: