Class: UV::BufferedTokenizer
- Inherits:
-
Object
- Object
- UV::BufferedTokenizer
- Defined in:
- lib/uv-rays/buffered_tokenizer.rb
Constant Summary collapse
- DEFAULT_ENCODING =
'ASCII-8BIT'.freeze
Instance Attribute Summary collapse
-
#delimiter ⇒ Object
Returns the value of attribute delimiter.
-
#indicator ⇒ Object
Returns the value of attribute indicator.
-
#size_limit ⇒ Object
Returns the value of attribute size_limit.
-
#verbose ⇒ Object
Returns the value of attribute verbose.
Instance Method Summary collapse
- #empty? ⇒ Boolean
-
#extract(data) ⇒ Object
Extract takes an arbitrary string of input data and returns an array of tokenized entities, provided there were any available to extract.
-
#flush ⇒ String
Flush the contents of the input buffer, i.e.
-
#initialize(options) ⇒ BufferedTokenizer
constructor
A new instance of BufferedTokenizer.
Constructor Details
#initialize(options) ⇒ BufferedTokenizer
Returns a new instance of BufferedTokenizer.
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 |
# File 'lib/uv-rays/buffered_tokenizer.rb', line 23 def initialize() @delimiter = [:delimiter] @indicator = [:indicator] @msg_length = [:msg_length] @size_limit = [:size_limit] @min_length = [:min_length] || 1 @verbose = [:verbose] if @size_limit @encoding = [:encoding] || DEFAULT_ENCODING if @delimiter @delimiter.force_encoding(@encoding) if @delimiter.is_a?(String) @indicator.force_encoding(@encoding) if @indicator.is_a?(String) @extract_method = method(:delimiter_extract) elsif @indicator && @msg_length @indicator.force_encoding(@encoding) if @indicator.is_a?(String) @extract_method = method(:length_extract) else raise ArgumentError, 'no delimiter provided' end init_buffer end |
Instance Attribute Details
#delimiter ⇒ Object
Returns the value of attribute delimiter.
20 21 22 |
# File 'lib/uv-rays/buffered_tokenizer.rb', line 20 def delimiter @delimiter end |
#indicator ⇒ Object
Returns the value of attribute indicator.
20 21 22 |
# File 'lib/uv-rays/buffered_tokenizer.rb', line 20 def indicator @indicator end |
#size_limit ⇒ Object
Returns the value of attribute size_limit.
20 21 22 |
# File 'lib/uv-rays/buffered_tokenizer.rb', line 20 def size_limit @size_limit end |
#verbose ⇒ Object
Returns the value of attribute verbose.
20 21 22 |
# File 'lib/uv-rays/buffered_tokenizer.rb', line 20 def verbose @verbose end |
Instance Method Details
#empty? ⇒ Boolean
73 74 75 |
# File 'lib/uv-rays/buffered_tokenizer.rb', line 73 def empty? @input.empty? end |
#extract(data) ⇒ Object
Extract takes an arbitrary string of input data and returns an array of tokenized entities, provided there were any available to extract.
55 56 57 58 59 60 |
# File 'lib/uv-rays/buffered_tokenizer.rb', line 55 def extract(data) data.force_encoding(@encoding) @input << data @extract_method.call end |
#flush ⇒ String
Flush the contents of the input buffer, i.e. return the input buffer even though a token has not yet been encountered.
66 67 68 69 70 |
# File 'lib/uv-rays/buffered_tokenizer.rb', line 66 def flush buffer = @input reset buffer end |