Class: UV::BufferedTokenizer
- Inherits: Object
  - Object
    - UV::BufferedTokenizer
- Defined in: lib/uv-rays/buffered_tokenizer.rb
Constant Summary
- DEFAULT_ENCODING = 'ASCII-8BIT'
Instance Attribute Summary
- #delimiter ⇒ Object
  Returns the value of attribute delimiter.
- #indicator ⇒ Object
  Returns the value of attribute indicator.
- #size_limit ⇒ Object
  Returns the value of attribute size_limit.
- #verbose ⇒ Object
  Returns the value of attribute verbose.
Instance Method Summary
- #bytesize ⇒ Integer
- #empty? ⇒ Boolean
- #extract(data) ⇒ Object
  Extract takes an arbitrary string of input data and returns an array of tokenized entities, provided there were any available to extract.
- #flush ⇒ String
  Flush the contents of the input buffer, i.e. return the input buffer even though a token has not yet been encountered.
- #initialize(options) ⇒ BufferedTokenizer (constructor)
  A new instance of BufferedTokenizer.
Constructor Details
#initialize(options) ⇒ BufferedTokenizer
Returns a new instance of BufferedTokenizer.
    # File 'lib/uv-rays/buffered_tokenizer.rb', line 25
    def initialize(options)
        @delimiter  = options[:delimiter]
        @indicator  = options[:indicator]
        @msg_length = options[:msg_length]
        @size_limit = options[:size_limit]
        @min_length = options[:min_length] || 1
        # verbose is only assigned when a size limit is configured
        @verbose    = options[:verbose] if @size_limit
        @encoding   = options[:encoding] || DEFAULT_ENCODING

        # Pick the extraction strategy: a delimiter takes precedence over
        # the indicator + msg_length pair.
        if @delimiter
            @extract_method = method(:delimiter_extract)
        elsif @indicator && @msg_length
            @extract_method = method(:length_extract)
        else
            raise ArgumentError, 'no delimiter provided'
        end

        init_buffer
    end
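The branch at the end of the constructor decides which extraction strategy an options hash selects. The sketch below restates that decision as a standalone function so it can be read in isolation. The name `extraction_mode` is hypothetical and not part of uv-rays; only the option keys and the error message come from the source above.

```ruby
# Hypothetical helper, not part of uv-rays: mirrors the branch in
# #initialize that chooses between the two extraction strategies.
def extraction_mode(options)
  if options[:delimiter]
    # A delimiter wins even when indicator/msg_length are also given.
    :delimiter_extract
  elsif options[:indicator] && options[:msg_length]
    # Length-based framing needs both the indicator and the message length.
    :length_extract
  else
    raise ArgumentError, 'no delimiter provided'
  end
end

extraction_mode(delimiter: "\r\n")                 # selects :delimiter_extract
extraction_mode(indicator: "\x02", msg_length: 8)  # selects :length_extract
```

Note that passing `:indicator` without `:msg_length` falls through to the `ArgumentError` branch, even though the error message only mentions the delimiter.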
Instance Attribute Details
#delimiter ⇒ Object
Returns the value of attribute delimiter.
    # File 'lib/uv-rays/buffered_tokenizer.rb', line 22
    def delimiter
        @delimiter
    end
#indicator ⇒ Object
Returns the value of attribute indicator.
    # File 'lib/uv-rays/buffered_tokenizer.rb', line 22
    def indicator
        @indicator
    end
#size_limit ⇒ Object
Returns the value of attribute size_limit.
    # File 'lib/uv-rays/buffered_tokenizer.rb', line 22
    def size_limit
        @size_limit
    end
#verbose ⇒ Object
Returns the value of attribute verbose.
    # File 'lib/uv-rays/buffered_tokenizer.rb', line 22
    def verbose
        @verbose
    end
Instance Method Details
#bytesize ⇒ Integer
    # File 'lib/uv-rays/buffered_tokenizer.rb', line 77
    def bytesize
        @input.bytesize
    end
#empty? ⇒ Boolean
    # File 'lib/uv-rays/buffered_tokenizer.rb', line 72
    def empty?
        @input.empty?
    end
#extract(data) ⇒ Object
Extract takes an arbitrary string of input data and returns an array of tokenized entities, provided there were any available to extract.
    # File 'lib/uv-rays/buffered_tokenizer.rb', line 54
    def extract(data)
        # Re-tags the caller's string in place with the configured encoding
        data.force_encoding(@encoding)
        @input << data

        # Dispatch to the strategy chosen in #initialize
        @extract_method.call
    end
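The source does not show `delimiter_extract`, so the following is only a minimal sketch of how buffered delimiter tokenizing can work, not the uv-rays implementation. `SketchTokenizer` is a hypothetical stand-in class. It assumes that complete tokens are returned without the delimiter and that a trailing incomplete piece stays buffered until a later call completes it. Options such as `size_limit` and `min_length` are ignored here.

```ruby
# SketchTokenizer is a hypothetical stand-in, not the uv-rays class.
# Assumption: tokens are split on the delimiter and the trailing
# incomplete piece is kept in the buffer for the next call.
class SketchTokenizer
  ENCODING = 'ASCII-8BIT'

  def initialize(delimiter)
    @delimiter = delimiter
    @input = String.new(encoding: ENCODING)
  end

  def extract(data)
    # dup so the caller's string is left untouched in this sketch
    @input << data.dup.force_encoding(ENCODING)

    # A limit of -1 keeps a trailing empty string when the buffer ends
    # with the delimiter, so the last element is always the remainder.
    parts = @input.split(@delimiter, -1)
    @input = parts.pop || String.new(encoding: ENCODING)
    parts
  end
end

tokenizer = SketchTokenizer.new("\n")
tokenizer.extract("foo\nba")  # one complete token; "ba" stays buffered
tokenizer.extract("r\nbaz")   # completes "bar"; "baz" stays buffered
```

A call that does not complete any token returns an empty array in this sketch, which matches the "provided there were any available to extract" wording above.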
#flush ⇒ String
Flush the contents of the input buffer, i.e. return the input buffer even though a token has not yet been encountered.
    # File 'lib/uv-rays/buffered_tokenizer.rb', line 65
    def flush
        buffer = @input
        reset
        buffer
    end
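The interplay of `#flush`, `#empty?` and `#bytesize` can be illustrated with a small stand-in. `SketchBuffer` is hypothetical and not part of uv-rays. Since `reset` is not shown in the source, the sketch assumes it replaces the internal buffer with a fresh empty string, so the string handed back by `flush` is not affected by later input.

```ruby
# SketchBuffer is a hypothetical stand-in, not the uv-rays class.
# Assumption: resetting means swapping in a new empty string.
class SketchBuffer
  def initialize
    @input = String.new(encoding: 'ASCII-8BIT')
  end

  # Append raw data to the buffer (stands in for the buffering done by extract)
  def <<(data)
    @input << data
    self
  end

  def empty?
    @input.empty?
  end

  def bytesize
    @input.bytesize
  end

  # Return the buffered data even though no token boundary was seen,
  # leaving the buffer empty afterwards.
  def flush
    buffer = @input
    @input = String.new(encoding: 'ASCII-8BIT')
    buffer
  end
end

buffer = SketchBuffer.new
buffer << 'partial'
leftover = buffer.flush  # the unterminated data; the buffer is empty again
```

This is typically useful when a connection closes and any unterminated remainder still needs to be handled.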