Class: Spellr::LineTokenizer
- Inherits: StringScanner (hierarchy: Object > StringScanner > Spellr::LineTokenizer)
- Includes: TokenRegexps
- Defined in: lib/spellr/line_tokenizer.rb
Constant Summary
Constants included from TokenRegexps
TokenRegexps::AFTER_KEY_SKIPS, TokenRegexps::ALPHA_SEP_RE, TokenRegexps::BACKSLASH_ESCAPE_RE, TokenRegexps::HEX_RE, TokenRegexps::KEY_DATA_URL, TokenRegexps::KEY_GTM_RE, TokenRegexps::KEY_HYPERWALLET_RE, TokenRegexps::KEY_PATTERNS_RE, TokenRegexps::KEY_SENDGRID_RE, TokenRegexps::KEY_SHA1, TokenRegexps::KEY_SHA512, TokenRegexps::LEFTOVER_NON_WORD_BITS_RE, TokenRegexps::LOWER_CASE_RE, TokenRegexps::NOT_EVEN_NON_WORDS_RE, TokenRegexps::NUM_SEP_RE, TokenRegexps::OTHER_CASE_RE, TokenRegexps::POSSIBLE_KEY_RE, TokenRegexps::PUNYCODE_RE, TokenRegexps::REPEATED_SINGLE_LETTERS_RE, TokenRegexps::SEQUENTIAL_LETTERS_RE, TokenRegexps::SHELL_COLOR_ESCAPE_RE, TokenRegexps::SKIPS, TokenRegexps::SPELLR_DISABLE_RE, TokenRegexps::SPELLR_ENABLE_RE, TokenRegexps::TERM_RE, TokenRegexps::THREE_CHUNK_RE, TokenRegexps::TITLE_CASE_RE, TokenRegexps::UPPER_CASE_RE, TokenRegexps::URL_ENCODED_ENTITIES_RE, TokenRegexps::URL_FRAGMENT, TokenRegexps::URL_HOSTNAME, TokenRegexps::URL_IP_ADDRESS, TokenRegexps::URL_PATH, TokenRegexps::URL_PORT, TokenRegexps::URL_QUERY, TokenRegexps::URL_RE, TokenRegexps::URL_REST, TokenRegexps::URL_SCHEME, TokenRegexps::URL_USERINFO
Instance Attribute Summary
- #disabled ⇒ Object (also: #disabled?)
  Returns the value of attribute disabled.
- #line ⇒ Object (readonly)
  Returns the value of attribute line.
- #skip_key ⇒ Object (also: #skip_key?)
  Returns the value of attribute skip_key.
Instance Method Summary
- #charpos=(new_charpos) ⇒ Object
  Jumps to a character-aware position. TODO: handle backward jumps.
- #each_term ⇒ Object
- #each_token(skip_term_proc: nil) ⇒ Object
  Yields a Token for each term on the line, skipping terms found while disabled and terms for which skip_term_proc returns a truthy value.
- #initialize(line, skip_key: true) ⇒ LineTokenizer (constructor)
  A new instance of LineTokenizer.
- #string=(line) ⇒ Object
Constructor Details
#initialize(line, skip_key: true) ⇒ LineTokenizer
Returns a new instance of LineTokenizer.
    # File 'lib/spellr/line_tokenizer.rb', line 20
    def initialize(line, skip_key: true)
      @line = line
      @skip_key = skip_key
      super(@line.to_s)
    end
Instance Attribute Details
#disabled ⇒ Object Also known as: disabled?
Returns the value of attribute disabled.
    # File 'lib/spellr/line_tokenizer.rb', line 13
    def disabled
      @disabled
    end
#line ⇒ Object (readonly)
Returns the value of attribute line.
    # File 'lib/spellr/line_tokenizer.rb', line 12
    def line
      @line
    end
#skip_key ⇒ Object Also known as: skip_key?
Returns the value of attribute skip_key.
    # File 'lib/spellr/line_tokenizer.rb', line 15
    def skip_key
      @skip_key
    end
Instance Method Details
#charpos=(new_charpos) ⇒ Object
Jumps to a character-aware position. TODO: handle backward jumps.
    # File 'lib/spellr/line_tokenizer.rb', line 53
    def charpos=(new_charpos)
      skip(/.{#{new_charpos - charpos}}/m)
    end
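The forward jump can be tried without the gem. The following is a minimal sketch that assumes only the standard strscan library; `CharposScanner` is a hypothetical stand-in, not part of Spellr. It shows why the jump is character-aware: the scanner skips a number of characters, so the byte offset moves further than the character offset when the text contains multibyte characters.

```ruby
require 'strscan'

# Hypothetical stand-in that copies the documented method body onto a
# plain StringScanner subclass.
class CharposScanner < StringScanner
  # Skip forward by the difference between the target and current
  # character positions. Backward jumps are not handled (see the TODO).
  def charpos=(new_charpos)
    skip(/.{#{new_charpos - charpos}}/m)
  end
end

# "héllo wörld", written with escapes so the literal is always UTF-8.
scanner = CharposScanner.new("h\u00e9llo w\u00f6rld")
scanner.charpos = 6

scanner.charpos # => 6 (characters consumed)
scanner.pos     # => 7 (bytes consumed; the e-acute is two bytes)
scanner.rest    # => "wörld"
```

The `m` flag lets `.` match newlines, so the skip does not stop early if the scanned text contains one.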
#each_term ⇒ Object
    # File 'lib/spellr/line_tokenizer.rb', line 32
    def each_term
      until eos?
        term = next_term
        next if !term || disabled?

        yield term
      end
    end
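The loop above relies on `next_term`, which is not shown in this section. As a rough sketch of the control flow only, the stand-in below assumes a much simpler `next_term` (a term is a run of letters, and anything else is consumed one character at a time). `TermScanner` and its `next_term` are hypothetical; the real tokenizer uses the TokenRegexps patterns instead.

```ruby
require 'strscan'

# Hypothetical stand-in that mirrors the each_term loop.
class TermScanner < StringScanner
  attr_accessor :disabled
  alias disabled? disabled

  def each_term
    until eos?
      term = next_term
      next if !term || disabled?

      yield term
    end
  end

  private

  # Assumed simplification: return a run of letters, or consume a single
  # non-letter character and return nil so the loop always advances.
  def next_term
    term = scan(/[[:alpha:]]+/)
    getch unless term
    term
  end
end

terms = []
TermScanner.new("foo, bar_baz 42").each_term { |term| terms << term }
terms # => ["foo", "bar", "baz"]

# While disabled is truthy the line is still consumed, but nothing is yielded.
muted = TermScanner.new("foo bar")
muted.disabled = true
muted_terms = []
muted.each_term { |term| muted_terms << term }
muted_terms # => []
```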
#each_token(skip_term_proc: nil) ⇒ Object
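Yields a Token for each term on the line, skipping terms found while disabled and terms for which skip_term_proc returns a truthy value.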
    # File 'lib/spellr/line_tokenizer.rb', line 41
    def each_token(skip_term_proc: nil) # rubocop:disable Metrics/MethodLength
      until eos?
        term = next_term
        next unless term
        next if disabled? || skip_term_proc&.call(term)

        yield Token.new(term, line: line, location: column_location(term))
      end
    end
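To illustrate how `skip_term_proc` filters terms before a token is built, here is a self-contained sketch. Everything in it is an assumption made for illustration: `SketchToken` stands in for `Token`, the column is computed from `charpos` rather than by `column_location`, `next_term` is a letters-only simplification, and the `disabled?` check is omitted.

```ruby
require 'strscan'

# Hypothetical stand-in for Spellr::Token, which is not shown here.
SketchToken = Struct.new(:term, :line, :column)

# Hypothetical stand-in that mirrors the each_token loop.
class TokenScanner < StringScanner
  def initialize(line)
    @line = line
    super(@line.to_s)
  end

  def each_token(skip_term_proc: nil)
    until eos?
      term = next_term
      next unless term
      # The proc is optional; &. makes a nil proc a no-op.
      next if skip_term_proc&.call(term)

      # Assumed location: the character column where the term started.
      yield SketchToken.new(term, @line, charpos - term.length)
    end
  end

  private

  # Assumed simplification: letters form a term, anything else is skipped.
  def next_term
    term = scan(/[[:alpha:]]+/)
    getch unless term
    term
  end
end

known = %w[the quick]
tokens = []
TokenScanner.new("the quikc fox")
            .each_token(skip_term_proc: ->(term) { known.include?(term) }) { |token| tokens << token }

tokens.map(&:term)   # => ["quikc", "fox"]
tokens.map(&:column) # => [4, 10]
```

Passing the filter as a proc keeps the tokenizer unaware of where the known terms come from; the caller decides what counts as skippable.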
#string=(line) ⇒ Object
    # File 'lib/spellr/line_tokenizer.rb', line 27
    def string=(line)
      @line = line
      super(@line.to_s)
    end
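Both the constructor and `string=` keep the original object in `@line` and hand only its `to_s` form to StringScanner. The sketch below reproduces that pattern on a hypothetical `LineHolder` class (not part of Spellr) to show the effect: `line` returns the object as given, `string` returns the text being scanned, and assigning a new string resets the scan pointer, which is standard `StringScanner#string=` behaviour.

```ruby
require 'strscan'

# Hypothetical stand-in that copies the documented constructor and setter.
class LineHolder < StringScanner
  attr_reader :line

  def initialize(line)
    @line = line
    super(@line.to_s)
  end

  def string=(line)
    @line = line
    super(@line.to_s)
  end
end

scanner = LineHolder.new("first line")
scanner.scan(/first/)      # move the scan pointer forward
scanner.string = :second   # any object that responds to to_s

scanner.line   # => :second (the object as given)
scanner.string # => "second" (the text being scanned)
scanner.pos    # => 0 (the scan pointer was reset)
```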