Class: Sooth::Predictor

Inherits:
Object
Defined in:
ext/sooth_native/native.c

Overview

A very simple stochastic predictor, implemented in C for efficiency. The idea is to build more complicated learning algorithms on top of a trivial Markovian predictor.

Constructor Details

#initialize(error_symbol) ⇒ Object

Returns a new Sooth::Predictor instance.

Parameters:

  • error_symbol (Fixnum)

    The symbol to be returned by #select when no prediction can be made.



# File 'ext/sooth_native/native.c', line 52

VALUE method_sooth_native_initialize(VALUE self, VALUE error_symbol);

Instance Method Details

#clear ⇒ Object

Clear the predictor to a fresh slate.



# File 'ext/sooth_native/native.c', line 57

def clear
  # (native code)
end

#count(bigram) ⇒ Fixnum

Return a count of the number of times the bigram has been observed.

Parameters:

  • bigram (Array)

    A pair of symbols.

Returns:

  • (Fixnum)

    A count of the number of times the bigram has been observed. This is guaranteed to be equal to the sum of the counts of observations of all symbols in the context of the bigram.



# File 'ext/sooth_native/native.c', line 17

def count(bigram)
  # (native code)
end
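
The guarantee stated above can be illustrated with a plain-Ruby model. This is only a sketch and not the native implementation: the nested Hash `observations` and the `count` lambda are hypothetical stand-ins, and reporting 0 for an unseen bigram is an assumption about behaviour the documentation does not spell out.

```ruby
# Hypothetical model: per-context symbol counts held in a nested Hash.
# context (bigram) => { symbol => observation count }
observations = Hash.new { |hash, bigram| hash[bigram] = Hash.new(0) }

# Three observations in the context [1, 2], one in the context [2, 3].
[[[1, 2], 3], [[1, 2], 3], [[1, 2], 4], [[2, 3], 5]].each do |bigram, symbol|
  observations[bigram][symbol] += 1
end

# The count of a bigram is the sum of the counts of every symbol seen
# after it. Returning 0 for an unseen bigram is an assumption of this sketch.
count = ->(bigram) { observations.fetch(bigram, {}).values.sum }

seen_count   = count.call([1, 2])  # 2 (symbol 3) + 1 (symbol 4)
unseen_count = count.call([9, 9])  # context never observed
```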

#load(filename) ⇒ Object

Load the predictor from the specified filename. The predictor will be cleared before the file is loaded.

Parameters:

  • filename (String)

    The path of the file to be loaded.



# File 'ext/sooth_native/native.c', line 65

def load(filename)
  # (native code)
end

#observe(bigram, symbol) ⇒ Fixnum

Add an observation of the given symbol in the context of the bigram.

Parameters:

  • bigram (Array)

    A pair of symbols that provide context, allowing the predictor to maintain observation statistics for different contexts.

  • symbol (Fixnum)

    The symbol that has been observed.

Returns:

  • (Fixnum)

    A count of the number of times the symbol has been observed in the context of the bigram.



# File 'ext/sooth_native/native.c', line 84

def observe(bigram, symbol)
  # (native code)
end
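
The bookkeeping described above can be modelled in plain Ruby. This is a sketch of the documented behaviour only, not the native implementation: `ObserveSketch` and its internal nested Hash are hypothetical names introduced for illustration.

```ruby
# Hypothetical pure-Ruby model of the documented #observe behaviour.
# The real predictor is implemented in C; only the return value described
# above is mirrored here.
class ObserveSketch
  def initialize
    # context (bigram) => { symbol => observation count }
    @contexts = Hash.new { |hash, bigram| hash[bigram] = Hash.new(0) }
  end

  # Record one observation and return the updated count for this symbol
  # in this context.
  def observe(bigram, symbol)
    @contexts[bigram][symbol] += 1
  end
end

sketch = ObserveSketch.new
first  = sketch.observe([1, 2], 3)  # first sighting of 3 after [1, 2]
second = sketch.observe([1, 2], 3)  # second sighting in the same context
other  = sketch.observe([2, 3], 3)  # a different context keeps its own count
```

Keying the statistics on the bigram is what lets the same symbol carry independent counts in different contexts.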

#save(filename) ⇒ Object

Save the predictor to a file that can be loaded or merged later.

Parameters:

  • filename (String)

    The path of the file to be saved.



# File 'ext/sooth_native/native.c', line 72

def save(filename)
  # (native code)
end

#select(bigram, limit) ⇒ Fixnum

Return a symbol that may occur in the context of the bigram. The limit determines which symbol is selected: the predictor iterates through the symbols that have been observed in the context of the bigram, subtracting each symbol's observation count from the supplied limit. For this reason, limit should be between 1 and the observation count of the bigram itself, as returned by #count.

Parameters:

  • bigram (Array)

    A pair of symbols.

  • limit (Fixnum)

    The total number of symbol observations to be analysed before returning a symbol.

Returns:

  • (Fixnum)

    A symbol that has been observed previously in the context of the bigram, or the error_symbol if no such symbol exists, or if the supplied limit was too large.



# File 'ext/sooth_native/native.c', line 20

def select(bigram, limit)
  # (native code)
end
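
The selection walk described above can be sketched in plain Ruby. The helper `select_sketch` is hypothetical, and two details are assumptions rather than documented facts: that symbols are visited in a fixed order, and that the symbol whose count exhausts the remaining limit is the one returned.

```ruby
# Hypothetical pure-Ruby model of the selection walk described above.
# counts: { symbol => observation count } for one bigram context.
def select_sketch(counts, limit, error_symbol)
  counts.each do |symbol, count|
    # Assumption: the symbol whose observations exhaust the limit is returned.
    return symbol if limit <= count
    limit -= count
  end
  error_symbol # nothing observed, or the limit exceeded the total count
end

counts = { 3 => 2, 4 => 1 }   # symbol 3 seen twice, symbol 4 once
total  = counts.values.sum    # what #count would report for this bigram

# Every valid limit from 1 to the total maps to one observed symbol.
picks     = (1..total).map { |limit| select_sketch(counts, limit, 0) }
too_large = select_sketch(counts, total + 1, 0)  # falls through to error symbol
empty     = select_sketch({}, 1, 0)              # no observations at all

# Drawing the limit uniformly from 1..total picks each symbol in
# proportion to its observation count.
random_pick = select_sketch(counts, rand(1..total), 0)
```

Under these assumptions, a caller who wants weighted random selection would pass a limit drawn uniformly between 1 and the value returned by #count.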

#surprise(bigram, symbol) ⇒ Float

Return a number indicating the surprise received by the predictor when it observed the given symbol after the given bigram. Note that nil will be returned if the symbol has never been observed after the bigram.

Parameters:

  • bigram (Array)

    A pair of symbols.

  • symbol (Fixnum)

    The symbol that has been observed.

Returns:

  • (Float)

    The surprise, which is calculated as the Shannon pointwise mutual information of the symbol according to the probability distribution over the alphabet of symbols in the context of the bigram.



# File 'ext/sooth_native/native.c', line 26

def surprise(bigram, symbol)
  # (native code)
end
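
One way to read the measure described above is as the self-information of the symbol, that is, the negative base-2 logarithm of its relative frequency in the context. The sketch below assumes exactly that; the formula, the logarithm base, and the helper name `surprise_sketch` are assumptions, not confirmed details of the native code.

```ruby
# Hypothetical model; assumes surprise = -log2(p(symbol | bigram)),
# with p estimated from observation counts.
# counts: { symbol => observation count } for one bigram context.
def surprise_sketch(counts, symbol)
  count = counts.fetch(symbol, 0)
  return nil if count.zero?          # never observed in this context
  total = counts.values.sum
  -Math.log2(count.to_f / total)
end

counts = { 3 => 3, 4 => 1 }          # symbol => count after one bigram
common = surprise_sketch(counts, 3)  # -log2(3/4), about 0.415 bits
rare   = surprise_sketch(counts, 4)  # -log2(1/4) = 2.0 bits
unseen = surprise_sketch(counts, 5)  # nil, as documented for unseen symbols
```

Rarer symbols yield larger values, which matches the intuition that the predictor is more surprised by them.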

#uncertainty(bigram) ⇒ Float

Return a number indicating how uncertain the predictor is about which symbol is likely to be observed after the given bigram. Note that nil will be returned if the bigram has never been observed.

Parameters:

  • bigram (Array)

    A pair of symbols.

Returns:

  • (Float)

    The uncertainty, which is calculated as the Shannon entropy of the probability distribution over the alphabet of symbols in the context of the bigram.



# File 'ext/sooth_native/native.c', line 23

def uncertainty(bigram)
  # (native code)
end
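
The Shannon entropy of a context can be sketched in plain Ruby from its observation counts. The helper `uncertainty_sketch` is hypothetical, and measuring in bits (base-2 logarithm) is an assumption; the native code may use a different base.

```ruby
# Hypothetical model; assumes entropy in bits over relative frequencies.
# counts: { symbol => observation count } for one bigram context,
# where every stored count is positive.
def uncertainty_sketch(counts)
  total = counts.values.sum
  return nil if total.zero?          # bigram never observed
  counts.values.sum do |count|
    prob = count.to_f / total
    -prob * Math.log2(prob)
  end
end

certain = uncertainty_sketch({ 3 => 5 })          # one outcome: zero bits
coin    = uncertainty_sketch({ 3 => 2, 4 => 2 })  # two equal outcomes: 1.0 bit
unseen  = uncertainty_sketch({})                  # nil, as documented
```

A context that has only ever been followed by one symbol carries no uncertainty, while evenly split observations maximise it.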