Class: Pocketsphinx::Decoder
- Inherits: Object
- Includes: API::CallHelpers
- Defined in: lib/pocketsphinx/decoder.rb
Defined Under Namespace
Classes: Hypothesis, Word
Instance Attribute Summary
- #configuration ⇒ Object
  Returns the value of attribute configuration.
- #ps_api ⇒ Object
Instance Method Summary
- #decode(audio_path_or_file, max_samples = 2048) ⇒ Object
  Decode a raw audio stream as a single utterance, opening the file if a path is given.
- #decode_raw(audio_file, max_samples = 2048) ⇒ Object
  Decode a raw audio stream as a single utterance.
- #end_utterance ⇒ Object
  End utterance processing.
- #get_search ⇒ Object
  Returns the name of the current search in the decoder.
- #hypothesis ⇒ Hypothesis
  Get hypothesis string (with #path_score and #utterance_id).
- #in_speech? ⇒ Boolean
  Checks whether the last audio buffer fed to the decoder contained speech.
- #initialize(configuration, ps_decoder = nil) ⇒ Decoder (constructor)
  Initialize a Decoder.
- #process_raw(buffer, size, no_search = false, full_utt = false) ⇒ Object
  Decode raw audio data.
- #ps_decoder ⇒ Object
- #reconfigure(configuration = nil) ⇒ Object
  Reinitialize the decoder with updated configuration.
- #set_jsgf_string(jsgf_string, name = 'default') ⇒ Object
  Adds a new search using a JSGF model.
- #set_search(name = 'default') ⇒ Object
  Activates the search with the provided name.
- #start_utterance ⇒ Object
  Start utterance processing.
- #unset_search(name = 'default') ⇒ Object
  Unsets the search and releases related resources.
- #words ⇒ Array
  Get an array of words with start/end frame values (10 ms per frame) for the current hypothesis.
Methods included from API::CallHelpers
Constructor Details
#initialize(configuration, ps_decoder = nil) ⇒ Decoder
Initialize a Decoder.
Note that initialization updates the Configuration with settings found in feat.params, which is stored alongside the acoustic model.

    # File 'lib/pocketsphinx/decoder.rb', line 28
    def initialize(configuration, ps_decoder = nil)
      @configuration = configuration
      init_decoder if ps_decoder.nil?
    end

Instance Attribute Details
#configuration ⇒ Object
Returns the value of attribute configuration.

    # File 'lib/pocketsphinx/decoder.rb', line 19
    def configuration
      @configuration
    end

#ps_api ⇒ Object

    # File 'lib/pocketsphinx/decoder.rb', line 181
    def ps_api
      @ps_api || API::Pocketsphinx
    end

Instance Method Details
#decode(audio_path_or_file, max_samples = 2048) ⇒ Object
Decode a raw audio stream as a single utterance, opening the file if a path is given.
See #decode_raw

    # File 'lib/pocketsphinx/decoder.rb', line 51
    def decode(audio_path_or_file, max_samples = 2048)
      case audio_path_or_file
      when String
        File.open(audio_path_or_file, 'rb') { |f| decode_raw(f, max_samples) }
      else
        decode_raw(audio_path_or_file, max_samples)
      end
    end

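One consequence of the dispatch above is that any String argument is treated as a file path, so raw audio bytes already held in memory would need to be wrapped in an IO-like object such as StringIO. The following is a minimal sketch of the same path-or-IO pattern in plain Ruby. The helper `with_binary_io` is hypothetical and is not part of pocketsphinx-ruby.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper mirroring the case dispatch in #decode:
# a String is opened as a binary file, anything else is used as-is.
def with_binary_io(path_or_io)
  case path_or_io
  when String
    File.open(path_or_io, 'rb') { |f| yield f }
  else
    yield path_or_io
  end
end

payload = "\x01\x00\x02\x00".b # two 16-bit samples as raw bytes

# In-memory bytes must be wrapped, otherwise they would be read as a path.
from_io = with_binary_io(StringIO.new(payload)) { |io| io.read.bytesize }

# The same bytes written to a temporary file and passed by path.
from_path = nil
Tempfile.create('pcm') do |tmp|
  tmp.binmode
  tmp.write(payload)
  tmp.flush
  from_path = with_binary_io(tmp.path) { |io| io.read.bytesize }
end

puts "from IO: #{from_io} bytes, from path: #{from_path} bytes"
```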
#decode_raw(audio_file, max_samples = 2048) ⇒ Object
Decode a raw audio stream as a single utterance.
No headers are recognized in these files. The configuration parameters samprate and input_endian determine the sampling rate and endianness of the stream, respectively. Audio is always assumed to be 16-bit signed PCM.

    # File 'lib/pocketsphinx/decoder.rb', line 68
    def decode_raw(audio_file, max_samples = 2048)
      start_utterance

      FFI::MemoryPointer.new(:int16, max_samples) do |buffer|
        while data = audio_file.read(max_samples * 2)
          buffer.write_string(data)
          process_raw(buffer, data.length / 2)
        end
      end

      end_utterance
    end

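The read loop above requests max_samples * 2 bytes at a time because each 16-bit sample occupies two bytes, and it passes data.length / 2 as the sample count. The sketch below reproduces only that chunking arithmetic on an in-memory stream, without the decoder. The helper name `each_pcm_chunk` is an assumption made for illustration.

```ruby
require 'stringio'

# Hypothetical helper (not part of pocketsphinx-ruby): yields each chunk of
# raw PCM together with the number of 16-bit samples it contains.
def each_pcm_chunk(io, max_samples = 2048)
  return enum_for(:each_pcm_chunk, io, max_samples) unless block_given?

  bytes_per_sample = 2 # 16-bit signed PCM
  # IO#read with a positive length returns nil at end of stream.
  while (data = io.read(max_samples * bytes_per_sample))
    yield data, data.bytesize / bytes_per_sample
  end
end

# 5000 samples of silence, packed as little-endian signed 16-bit values.
audio = StringIO.new(([0] * 5000).pack('s<*'))

sample_counts = each_pcm_chunk(audio).map { |_chunk, samples| samples }
puts sample_counts.inspect
```

The final chunk is shorter than max_samples whenever the stream length is not a multiple of the chunk size, which is why the sample count is computed per chunk rather than assumed.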
#end_utterance ⇒ Object
End utterance processing

    # File 'lib/pocketsphinx/decoder.rb', line 103
    def end_utterance
      api_call :ps_end_utt, ps_decoder
    end

#get_search ⇒ Object
Returns the name of the current search in the decoder.

    # File 'lib/pocketsphinx/decoder.rb', line 161
    def get_search
      ps_api.ps_get_search(ps_decoder)
    end

#hypothesis ⇒ Hypothesis
Get hypothesis string (with #path_score and #utterance_id).

    # File 'lib/pocketsphinx/decoder.rb', line 115
    def hypothesis
      mp_path_score = FFI::MemoryPointer.new(:int32, 1)

      hypothesis = ps_api.ps_get_hyp(ps_decoder, mp_path_score)

      hypothesis.nil? ? nil : Hypothesis.new(
        hypothesis,
        mp_path_score.get_int32(0)
      )
    end

#in_speech? ⇒ Boolean
Checks whether the last audio buffer fed to the decoder contained speech.

    # File 'lib/pocketsphinx/decoder.rb', line 108
    def in_speech?
      ps_api.ps_get_in_speech(ps_decoder) != 0
    end

#process_raw(buffer, size, no_search = false, full_utt = false) ⇒ Object
Decode raw audio data.

    # File 'lib/pocketsphinx/decoder.rb', line 89
    def process_raw(buffer, size, no_search = false, full_utt = false)
      api_call :ps_process_raw, ps_decoder, buffer, size, no_search ? 1 : 0, full_utt ? 1 : 0
    end

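Judging from how #decode_raw calls this method, size is a count of 16-bit samples rather than a count of bytes. The sketch below shows one way sample values held as Ruby integers could be turned into the raw byte string that would be written into such a buffer. It uses only core Ruby; the little-endian byte order is an assumption for illustration, and the order the decoder actually expects depends on its configuration.

```ruby
# Four 16-bit signed samples as plain Ruby integers.
samples = [0, 1000, -1000, 32_767]

# 's<' packs each integer as a signed 16-bit little-endian value
# (the byte order here is an assumption, see the note above).
bytes = samples.pack('s<*')

# Two bytes per sample, so the sample count is half the byte count.
size = bytes.bytesize / 2

# Unpacking with the same directive recovers the original values.
roundtrip = bytes.unpack('s<*')

puts "#{bytes.bytesize} bytes, #{size} samples, roundtrip: #{roundtrip.inspect}"
```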
#ps_decoder ⇒ Object

    # File 'lib/pocketsphinx/decoder.rb', line 185
    def ps_decoder
      init_decoder if @ps_decoder.nil?
      @ps_decoder
    end

#reconfigure(configuration = nil) ⇒ Object
Reinitialize the decoder with updated configuration.
This function allows you to switch the acoustic model, dictionary, or other configuration without creating an entirely new decoding object.

    # File 'lib/pocketsphinx/decoder.rb', line 40
    def reconfigure(configuration = nil)
      self.configuration = configuration if configuration

      reinit_decoder
    end

#set_jsgf_string(jsgf_string, name = 'default') ⇒ Object
Adds new search using JSGF model.
Convenience method to parse JSGF model from string and create a search.

    # File 'lib/pocketsphinx/decoder.rb', line 156
    def set_jsgf_string(jsgf_string, name = 'default')
      api_call :ps_set_jsgf_string, ps_decoder, name, jsgf_string
    end

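For readers unfamiliar with JSGF, the sketch below builds a small grammar string of the kind this method takes as jsgf_string. The helper `jsgf_for`, the grammar name, and the rule name are all hypothetical and chosen for illustration. The commented-out calls at the end only restate the signatures documented on this page and assume an already initialized decoder.

```ruby
# Hypothetical helper: builds a one-rule JSGF grammar that accepts
# any one of the given phrases.
def jsgf_for(phrases, grammar_name: 'commands')
  alternatives = phrases.join(' | ')
  <<~JSGF
    #JSGF V1.0;
    grammar #{grammar_name};
    public <command> = #{alternatives};
  JSGF
end

grammar = jsgf_for(['go forward', 'go backward', 'stop'])
puts grammar

# With a decoder instance (assumed, not created here), the grammar would be
# added under a search name and then activated:
#   decoder.set_jsgf_string(grammar, 'commands')
#   decoder.set_search('commands')
```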
#set_search(name = 'default') ⇒ Object
Activates the search with the provided name.
The search must already have been added using ps_set_fsg(), ps_set_lm() or ps_set_kws().

    # File 'lib/pocketsphinx/decoder.rb', line 169
    def set_search(name = 'default')
      api_call :ps_set_search, ps_decoder, name
    end

#start_utterance ⇒ Object
Start utterance processing.
This function should be called before any utterance data is passed to the decoder. It marks the start of a new utterance and reinitializes internal data structures.

    # File 'lib/pocketsphinx/decoder.rb', line 98
    def start_utterance
      api_call :ps_start_utt, ps_decoder
    end

#unset_search(name = 'default') ⇒ Object
Unsets the search and releases related resources.
Unsets a search previously added using ps_set_fsg(), ps_set_lm() or ps_set_kws().

    # File 'lib/pocketsphinx/decoder.rb', line 177
    def unset_search(name = 'default')
      api_call :ps_unset_search, ps_decoder, name
    end

#words ⇒ Array
Get an array of words with start/end frame values (10msec/frame) for current hypothesis

    # File 'lib/pocketsphinx/decoder.rb', line 129
    def words
      mp_path_score = FFI::MemoryPointer.new(:int32, 1)
      start_frame   = FFI::MemoryPointer.new(:int32, 1)
      end_frame     = FFI::MemoryPointer.new(:int32, 1)

      seg_iter = ps_api.ps_seg_iter(ps_decoder, mp_path_score)
      words    = []

      until seg_iter.null? do
        ps_api.ps_seg_frames(seg_iter, start_frame, end_frame)
        words << Pocketsphinx::Decoder::Word.new(
          ps_api.ps_seg_word(seg_iter),
          start_frame.get_int32(0),
          end_frame.get_int32(0)
        )
        seg_iter = ps_api.ps_seg_next(seg_iter)
      end

      words
    end

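Since frame values are expressed in 10 ms units, converting them to time offsets is a single multiplication. The sketch below shows that conversion on stand-in data. The `Word` struct here is a local substitute whose field names are assumed from the argument order of Word.new in the listing above, and the frame numbers are made up for illustration.

```ruby
# Stand-in for Pocketsphinx::Decoder::Word; field names are assumptions
# based on the constructor arguments (word, start frame, end frame).
Word = Struct.new(:word, :start_frame, :end_frame)

MS_PER_FRAME = 10 # 10 ms per frame, as documented for #words

# Hypothetical helper: maps frame indices to millisecond offsets.
def word_timings_ms(words)
  words.map do |w|
    {
      word: w.word,
      start_ms: w.start_frame * MS_PER_FRAME,
      end_ms: w.end_frame * MS_PER_FRAME
    }
  end
end

# Illustrative values only, not real decoder output.
words = [Word.new('go', 46, 63), Word.new('forward', 64, 121)]
timings = word_timings_ms(words)

timings.each do |t|
  puts format('%-8s %5d ms .. %5d ms', t[:word], t[:start_ms], t[:end_ms])
end
```

Integer milliseconds are used instead of fractional seconds so that the results stay exact.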