Class: Pocketsphinx::Decoder

Inherits:
Struct
  • Object
Includes:
API::CallHelpers
Defined in:
lib/pocketsphinx/decoder.rb

Instance Attribute Summary

Instance Method Summary

Methods included from API::CallHelpers

#api_call

Instance Attribute Details

#configuration ⇒ Object

Returns the value of attribute configuration

Returns:

  • (Object)

    the current value of configuration



# File 'lib/pocketsphinx/decoder.rb', line 2

def configuration
  @configuration
end

#ps_api ⇒ Object



# File 'lib/pocketsphinx/decoder.rb', line 127

def ps_api
  @ps_api || API::Pocketsphinx
end

Instance Method Details

#decode(audio_path_or_file, max_samples = 2048) ⇒ Object

Decode a raw audio stream as a single utterance, opening the file first if a path is given.

See #decode_raw

Parameters:

  • audio_path_or_file (IO, String)

    The raw audio stream, or the path of a raw audio file, to decode as a single utterance

  • max_samples (Fixnum) (defaults to: 2048)

    The maximum samples to process from the stream on each iteration



# File 'lib/pocketsphinx/decoder.rb', line 25

def decode(audio_path_or_file, max_samples = 2048)
  case audio_path_or_file
  when String
    File.open(audio_path_or_file, 'rb') { |f| decode_raw(f, max_samples) }
  else
    decode_raw(audio_path_or_file, max_samples)
  end
end
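
The dispatch above can be tried out without loading the gem. The following is a minimal sketch only: read_all_bytes is a hypothetical stand-in for decode_raw and bytes_from is a hypothetical stand-in for decode; neither is part of Pocketsphinx::Decoder.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical stand-in for decode_raw: returns the number of bytes in the stream.
def read_all_bytes(io)
  io.read.bytesize
end

# Same dispatch shape as #decode: open the file when given a String path,
# otherwise treat the argument as an already-open IO-like object.
def bytes_from(audio_path_or_file)
  case audio_path_or_file
  when String
    File.open(audio_path_or_file, 'rb') { |f| read_all_bytes(f) }
  else
    read_all_bytes(audio_path_or_file)
  end
end

pcm = "\x00\x01" * 4 # four fake 16-bit samples (8 bytes)

# Passing an IO-like object takes the else branch.
from_stream = bytes_from(StringIO.new(pcm))

# Passing a path takes the String branch and opens the file in binary mode.
from_path = Tempfile.create('audio') do |tmp|
  tmp.binmode
  tmp.write(pcm)
  tmp.flush
  bytes_from(tmp.path)
end
```

Both calls see the same 8 bytes. Note that any String argument is treated as a path, so raw audio held in a String would need wrapping in a StringIO first.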

#decode_raw(audio_file, max_samples = 2048) ⇒ Object

Decode a raw audio stream as a single utterance.

No headers are recognized in these files. The configuration parameters samprate and input_endian are used to determine the sampling rate and endianness of the stream, respectively. Audio is always assumed to be 16-bit signed PCM.

Parameters:

  • audio_file (IO)

    The raw audio stream to decode as a single utterance

  • max_samples (Fixnum) (defaults to: 2048)

    The maximum samples to process from the stream on each iteration



# File 'lib/pocketsphinx/decoder.rb', line 42

def decode_raw(audio_file, max_samples = 2048)
  start_utterance

  FFI::MemoryPointer.new(:int16, max_samples) do |buffer|
    while data = audio_file.read(max_samples * 2)
      buffer.write_string(data)
      process_raw(buffer, data.length / 2)
    end
  end

  end_utterance
end
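
To make the chunk arithmetic in the loop concrete, here is a small sketch that mirrors the read loop with a StringIO instead of real audio and without any FFI buffer. chunk_sizes is a hypothetical helper written for this illustration, not part of the library.

```ruby
require 'stringio'

# Returns [byte_count, sample_count] for each read, mirroring the loop in
# decode_raw: 16-bit PCM means two bytes per sample, so each read asks for
# max_samples * 2 bytes and reports bytes / 2 samples.
def chunk_sizes(io, max_samples = 2048)
  sizes = []
  while (data = io.read(max_samples * 2)) # nil at end of stream ends the loop
    sizes << [data.bytesize, data.bytesize / 2]
  end
  sizes
end

# 5000 silent samples (10,000 bytes) of fake 16-bit audio.
audio = StringIO.new("\x00".b * 10_000)
sizes = chunk_sizes(audio)
# sizes => [[4096, 2048], [4096, 2048], [1808, 904]]
```

Only the final chunk is shorter than max_samples. Because the sample count uses integer division, a stream with an odd number of bytes would have its trailing byte left out of the count.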

#end_utterance ⇒ Object

End utterance processing



# File 'lib/pocketsphinx/decoder.rb', line 79

def end_utterance
  api_call :ps_end_utt, ps_decoder
end

#get_search ⇒ Object

Returns the name of the current search in the decoder



# File 'lib/pocketsphinx/decoder.rb', line 107

def get_search
  ps_api.ps_get_search(ps_decoder)
end

#hypothesis ⇒ String

TODO:

Expand to return path score and utterance ID

Get the hypothesis string. The path score is not returned yet (see the TODO above).

Returns:

  • (String)

    Hypothesis string



# File 'lib/pocketsphinx/decoder.rb', line 92

def hypothesis
  ps_api.ps_get_hyp(ps_decoder, nil, nil)
end

#in_speech? ⇒ Boolean

Checks whether the last audio buffer fed to the decoder contained speech

Returns:

  • (Boolean)


# File 'lib/pocketsphinx/decoder.rb', line 84

def in_speech?
  ps_api.ps_get_in_speech(ps_decoder) != 0
end

#process_raw(buffer, size, no_search = false, full_utt = false) ⇒ Object

Decode raw audio data.

Parameters:

  • buffer (FFI::MemoryPointer)

    Buffer holding the 16-bit signed PCM samples to process

  • size (Fixnum)

    The number of samples in the buffer

  • no_search (Boolean) (defaults to: false)

    If true, perform feature extraction but don’t do any recognition yet. This may be necessary if your processor has trouble doing recognition in real-time.

  • full_utt (Boolean) (defaults to: false)

    If true, this block of data is a full utterance worth of data. This may allow the recognizer to produce more accurate results.

Returns:

  • Number of frames of data searched



# File 'lib/pocketsphinx/decoder.rb', line 63

def process_raw(buffer, size, no_search = false, full_utt = false)
  api_call :ps_process_raw, ps_decoder, buffer, size, no_search ? 1 : 0, full_utt ? 1 : 0
end

#ps_decoder ⇒ Object



# File 'lib/pocketsphinx/decoder.rb', line 131

def ps_decoder
  init_decoder if @ps_decoder.nil?
  @ps_decoder
end

#reconfigure(configuration = nil) ⇒ Object

Reinitialize the decoder with updated configuration.

This function allows you to switch the acoustic model, dictionary, or other configuration without creating an entirely new decoding object.

Parameters:

  • configuration (Configuration) (defaults to: nil)

    An optional new configuration to use. If this is nil, the previous configuration will be reloaded, with any changes applied.



# File 'lib/pocketsphinx/decoder.rb', line 14

def reconfigure(configuration = nil)
  self.configuration = configuration if configuration
  reinit_decoder
end

#set_jsgf_string(jsgf_string, name = 'default') ⇒ Object

Adds a new search using a JSGF model.

Convenience method that parses a JSGF model from a string and creates a search from it.

Parameters:

  • jsgf_string (String)

    The JSGF grammar

  • name (String) (defaults to: 'default')

    The search name



# File 'lib/pocketsphinx/decoder.rb', line 102

def set_jsgf_string(jsgf_string, name = 'default')
  api_call :ps_set_jsgf_string, ps_decoder, name, jsgf_string
end

#set_search(name = 'default') ⇒ Object

Activates the search with the provided name.

The search must have been added beforehand, for example with #set_jsgf_string or with the underlying ps_set_fsg(), ps_set_lm() or ps_set_kws() calls.



# File 'lib/pocketsphinx/decoder.rb', line 115

def set_search(name = 'default')
  api_call :ps_set_search, ps_decoder, name
end
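
As a sketch of how #set_jsgf_string, #set_search, #decode and #hypothesis might be combined, the helper below registers a grammar as a named search, activates it, decodes some audio and returns the hypothesis. Everything outside those four method names is an assumption made for illustration: recognize_with_grammar, the grammar text, the search name 'commands', and RecordingDecoder (a stub that lets the sketch run without the gem or any audio).

```ruby
# A small JSGF grammar; the grammar name and rule are illustrative only.
GRAMMAR = <<~'JSGF'
  #JSGF V1.0;
  grammar commands;
  public <command> = go ( forward | back );
JSGF

# Hypothetical helper: `decoder` is any object responding to the Decoder
# methods documented on this page.
def recognize_with_grammar(decoder, audio, grammar, name = 'commands')
  decoder.set_jsgf_string(grammar, name) # add the search
  decoder.set_search(name)               # make it the active search
  decoder.decode(audio)                  # path or IO, as described for #decode
  decoder.hypothesis
end

# Hypothetical stub that records calls instead of running recognition.
class RecordingDecoder
  attr_reader :calls

  def initialize
    @calls = []
  end

  def set_jsgf_string(_grammar, name)
    @calls << [:set_jsgf_string, name]
  end

  def set_search(name)
    @calls << [:set_search, name]
  end

  def decode(_audio)
    @calls << [:decode]
  end

  # Canned answer; a real decoder would return what it recognized.
  def hypothesis
    'go forward'
  end
end

stub = RecordingDecoder.new
result = recognize_with_grammar(stub, :fake_audio, GRAMMAR)
```

With the real library, a Pocketsphinx::Decoder instance would take the place of the stub. The ordering is the point of the sketch: the search is added before it is activated, and activated before decoding.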

#start_utterance(name = nil) ⇒ Object

Start utterance processing.

This function should be called before any utterance data is passed to the decoder. It marks the start of a new utterance and reinitializes internal data structures.

Parameters:

  • name (String) (defaults to: nil)

    String uniquely identifying this utterance. If nil, one will be created.



# File 'lib/pocketsphinx/decoder.rb', line 74

def start_utterance(name = nil)
  api_call :ps_start_utt, ps_decoder, name
end

#unset_search(name = 'default') ⇒ Object

Unsets the search and releases related resources.

Unsets a search previously added with ps_set_fsg(), ps_set_lm() or ps_set_kws().



# File 'lib/pocketsphinx/decoder.rb', line 123

def unset_search(name = 'default')
  api_call :ps_unset_search, ps_decoder, name
end