Class: JMESPath::Lexer Private

Inherits:
Object
Defined in:
lib/jmespath/lexer.rb

This class is part of a private API. You should avoid using this class if possible, as it may be removed or changed in the future.

Defined Under Namespace

Classes: CharacterStream

Constant Summary

The constants below are part of a private API. You should avoid using them if possible, as they may be removed or changed in the future.

T_DOT = :dot
T_STAR = :star
T_COMMA = :comma
T_COLON = :colon
T_CURRENT = :current
T_EXPREF = :expref
T_LPAREN = :lparen
T_RPAREN = :rparen
T_LBRACE = :lbrace
T_RBRACE = :rbrace
T_LBRACKET = :lbracket
T_RBRACKET = :rbracket
T_FLATTEN = :flatten
T_IDENTIFIER = :identifier
T_NUMBER = :number
T_QUOTED_IDENTIFIER = :quoted_identifier
T_UNKNOWN = :unknown
T_PIPE = :pipe
T_OR = :or
T_AND = :and
T_NOT = :not
T_FILTER = :filter
T_LITERAL = :literal
T_EOF = :eof
T_COMPARATOR = :comparator

STATE_IDENTIFIER = 0
STATE_NUMBER = 1
STATE_SINGLE_CHAR = 2
STATE_WHITESPACE = 3
STATE_STRING_LITERAL = 4
STATE_QUOTED_STRING = 5
STATE_JSON_LITERAL = 6
STATE_LBRACKET = 7
STATE_PIPE = 8
STATE_LT = 9
STATE_GT = 10
STATE_EQ = 11
STATE_NOT = 12
STATE_AND = 13

TRANSLATION_TABLE = {
  '<'  => STATE_LT,
  '>'  => STATE_GT,
  '='  => STATE_EQ,
  '!'  => STATE_NOT,
  '['  => STATE_LBRACKET,
  '|'  => STATE_PIPE,
  '&'  => STATE_AND,
  '`'  => STATE_JSON_LITERAL,
  '"'  => STATE_QUOTED_STRING,
  "'"  => STATE_STRING_LITERAL,
  '-'  => STATE_NUMBER,
  '0'  => STATE_NUMBER,
  '1'  => STATE_NUMBER,
  '2'  => STATE_NUMBER,
  '3'  => STATE_NUMBER,
  '4'  => STATE_NUMBER,
  '5'  => STATE_NUMBER,
  '6'  => STATE_NUMBER,
  '7'  => STATE_NUMBER,
  '8'  => STATE_NUMBER,
  '9'  => STATE_NUMBER,
  ' '  => STATE_WHITESPACE,
  "\t" => STATE_WHITESPACE,
  "\n" => STATE_WHITESPACE,
  "\r" => STATE_WHITESPACE,
  '.'  => STATE_SINGLE_CHAR,
  '*'  => STATE_SINGLE_CHAR,
  ']'  => STATE_SINGLE_CHAR,
  ','  => STATE_SINGLE_CHAR,
  ':'  => STATE_SINGLE_CHAR,
  '@'  => STATE_SINGLE_CHAR,
  '('  => STATE_SINGLE_CHAR,
  ')'  => STATE_SINGLE_CHAR,
  '{'  => STATE_SINGLE_CHAR,
  '}'  => STATE_SINGLE_CHAR,
  '_'  => STATE_IDENTIFIER,
  'A'  => STATE_IDENTIFIER,
  'B'  => STATE_IDENTIFIER,
  'C'  => STATE_IDENTIFIER,
  'D'  => STATE_IDENTIFIER,
  'E'  => STATE_IDENTIFIER,
  'F'  => STATE_IDENTIFIER,
  'G'  => STATE_IDENTIFIER,
  'H'  => STATE_IDENTIFIER,
  'I'  => STATE_IDENTIFIER,
  'J'  => STATE_IDENTIFIER,
  'K'  => STATE_IDENTIFIER,
  'L'  => STATE_IDENTIFIER,
  'M'  => STATE_IDENTIFIER,
  'N'  => STATE_IDENTIFIER,
  'O'  => STATE_IDENTIFIER,
  'P'  => STATE_IDENTIFIER,
  'Q'  => STATE_IDENTIFIER,
  'R'  => STATE_IDENTIFIER,
  'S'  => STATE_IDENTIFIER,
  'T'  => STATE_IDENTIFIER,
  'U'  => STATE_IDENTIFIER,
  'V'  => STATE_IDENTIFIER,
  'W'  => STATE_IDENTIFIER,
  'X'  => STATE_IDENTIFIER,
  'Y'  => STATE_IDENTIFIER,
  'Z'  => STATE_IDENTIFIER,
  'a'  => STATE_IDENTIFIER,
  'b'  => STATE_IDENTIFIER,
  'c'  => STATE_IDENTIFIER,
  'd'  => STATE_IDENTIFIER,
  'e'  => STATE_IDENTIFIER,
  'f'  => STATE_IDENTIFIER,
  'g'  => STATE_IDENTIFIER,
  'h'  => STATE_IDENTIFIER,
  'i'  => STATE_IDENTIFIER,
  'j'  => STATE_IDENTIFIER,
  'k'  => STATE_IDENTIFIER,
  'l'  => STATE_IDENTIFIER,
  'm'  => STATE_IDENTIFIER,
  'n'  => STATE_IDENTIFIER,
  'o'  => STATE_IDENTIFIER,
  'p'  => STATE_IDENTIFIER,
  'q'  => STATE_IDENTIFIER,
  'r'  => STATE_IDENTIFIER,
  's'  => STATE_IDENTIFIER,
  't'  => STATE_IDENTIFIER,
  'u'  => STATE_IDENTIFIER,
  'v'  => STATE_IDENTIFIER,
  'w'  => STATE_IDENTIFIER,
  'x'  => STATE_IDENTIFIER,
  'y'  => STATE_IDENTIFIER,
  'z'  => STATE_IDENTIFIER,
}
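
The `case` dispatch in `#tokenize` looks up the current character in this table, and a missing key (`nil`) marks the character as unrecognized. The following is a minimal standalone sketch of that lookup. It uses a trimmed-down copy of the table, and the names `MINI_TABLE` and `classify` are hypothetical, introduced only for illustration.

```ruby
# Trimmed-down stand-in for TRANSLATION_TABLE. The state numbers mirror the
# constants listed above; MINI_TABLE and classify are local to this sketch.
STATE_IDENTIFIER  = 0
STATE_NUMBER      = 1
STATE_SINGLE_CHAR = 2
STATE_WHITESPACE  = 3

MINI_TABLE = {
  '.' => STATE_SINGLE_CHAR,
  ' ' => STATE_WHITESPACE,
  '-' => STATE_NUMBER,
  '7' => STATE_NUMBER,
  '_' => STATE_IDENTIFIER,
  'a' => STATE_IDENTIFIER,
}.freeze

# Returns the lexer state for a single character, or nil when the character
# has no entry (the case that #tokenize turns into a T_UNKNOWN token).
def classify(char)
  MINI_TABLE[char]
end

p classify('a')  # => 0
p classify('-')  # => 1
p classify('#')  # => nil
```

A flat hash keyed by single characters keeps the dispatch to one lookup per character, at the cost of listing every letter and digit explicitly.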

VALID_IDENTIFIERS = Set.new(%w(
  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  a b c d e f g h i j k l m n o p q r s t u v w x y z
  _ 0 1 2 3 4 5 6 7 8 9
))

NUMBERS = Set.new(%w(0 1 2 3 4 5 6 7 8 9))
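
`NUMBERS` is consulted by the `STATE_NUMBER` branch of `#tokenize`: the first character (a digit or `-`) is always consumed, and further characters are appended while they belong to the set. The following is a rough standalone sketch of that loop over a plain array and index. The `scan_number` helper and the `DIGITS` name are hypothetical and not part of the library.

```ruby
require 'set'

# Local stand-in for NUMBERS, so the sketch runs on its own.
DIGITS = Set.new(%w(0 1 2 3 4 5 6 7 8 9))

# Hypothetical helper: scans a number starting at index pos and returns
# [integer_value, next_index]. The first character is taken unconditionally,
# mirroring the do-while loop in the STATE_NUMBER branch.
def scan_number(chars, pos)
  buffer = [chars[pos]]
  pos += 1
  while DIGITS.include?(chars[pos])
    buffer << chars[pos]
    pos += 1
  end
  [buffer.join.to_i, pos]
end

p scan_number('-12]'.chars, 0)  # => [-12, 3]
p scan_number('-x'.chars, 0)    # => [0, 1]
```

Because the buffer is converted with `to_i`, a lone `-` that is not followed by a digit produces the value `0` under this logic rather than an error.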

SIMPLE_TOKENS = {
  '.' => T_DOT,
  '*' => T_STAR,
  ']' => T_RBRACKET,
  ',' => T_COMMA,
  ':' => T_COLON,
  '@' => T_CURRENT,
  '(' => T_LPAREN,
  ')' => T_RPAREN,
  '{' => T_LBRACE,
  '}' => T_RBRACE,
}

Instance Method Summary

  • #tokenize(expression) ⇒ Array<Hash>

Instance Method Details

#tokenize(expression) ⇒ Array<Hash>

This method is part of a private API. You should avoid using this method if possible, as it may be removed or changed in the future.

Parameters:

  • expression (String)

Returns:

  • (Array<Hash>)


# File 'lib/jmespath/lexer.rb', line 163

def tokenize(expression)

  tokens = []
  chars = CharacterStream.new(expression.chars.to_a)

  while chars.current
    case TRANSLATION_TABLE[chars.current]
    when nil
      # emit an unknown token for any character missing from TRANSLATION_TABLE
      tokens << Token.new(
        T_UNKNOWN,
        chars.current,
        chars.position
      )
      chars.next
    when STATE_SINGLE_CHAR
      # consume simple tokens like ".", ",", "@", etc.
      tokens << Token.new(
        SIMPLE_TOKENS[chars.current],
        chars.current,
        chars.position
      )
      chars.next
    when STATE_IDENTIFIER
      # consume unquoted identifiers: a start character, then any VALID_IDENTIFIERS
      start = chars.position
      buffer = []
      begin
        buffer << chars.current
        chars.next
      end while VALID_IDENTIFIERS.include?(chars.current)
      tokens << Token.new(
        T_IDENTIFIER,
        buffer.join,
        start
      )
    when STATE_WHITESPACE
      # skip whitespace
      chars.next
    when STATE_LBRACKET
      # consume "[", "[?" and "[]"
      position = chars.position
      actual = chars.next
      if actual == ']'
        chars.next
        tokens << Token.new(T_FLATTEN, '[]', position)
      elsif actual == '?'
        chars.next
        tokens << Token.new(T_FILTER, '[?', position)
      else
        tokens << Token.new(T_LBRACKET, '[', position)
      end
    when STATE_STRING_LITERAL
      # consume raw string literals
      t = inside(chars, "'", T_LITERAL)
      t.value = t.value.gsub("\\'", "'")
      tokens << t
    when STATE_PIPE
      # consume pipe and OR
      tokens << match_or(chars, '|', '|', T_OR, T_PIPE)
    when STATE_JSON_LITERAL
      # consume JSON literals
      token = inside(chars, '`', T_LITERAL)
      if token.type == T_LITERAL
        token.value = token.value.gsub('\\`', '`')
        token = parse_json(token)
      end
      tokens << token
    when STATE_NUMBER
      # consume a leading "-" or digit, then any digits that follow
      start = chars.position
      buffer = []
      begin
        buffer << chars.current
        chars.next
      end while NUMBERS.include?(chars.current)
      tokens << Token.new(
        T_NUMBER,
        buffer.join.to_i,
        start
      )
    when STATE_QUOTED_STRING
      # consume quoted identifiers
      token = inside(chars, '"', T_QUOTED_IDENTIFIER)
      if token.type == T_QUOTED_IDENTIFIER
        token.value = "\"#{token.value}\""
        token = parse_json(token, true)
      end
      tokens << token
    when STATE_EQ
      # consume "=="; a lone "=" becomes an unknown token
      tokens << match_or(chars, '=', '=', T_COMPARATOR, T_UNKNOWN)
    when STATE_AND
      # consume "&&" (and) or "&" (expression reference)
      tokens << match_or(chars, '&', '&', T_AND, T_EXPREF)
    when STATE_NOT
      # consume "!=" (comparator) or "!" (not)
      tokens << match_or(chars, '!', '=', T_COMPARATOR, T_NOT)
    else
      # either '<' or '>'
      # consume less than and greater than
      tokens << match_or(chars, chars.current, '=', T_COMPARATOR, T_COMPARATOR)
    end
  end
  tokens << Token.new(T_EOF, nil, chars.position)
  tokens
end
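
`#tokenize` relies on `CharacterStream` and the `match_or` helper, neither of which is shown in this section. The sketch below is a guess at their behavior, inferred only from the calls above: `chars.next` advances and returns the new current character, and `match_or` emits the first token type when the following character matches and the second type otherwise. `SketchStream`, the `Token` struct, and the choice to record the operator's start position are all assumptions, not the library's actual code.

```ruby
# Hypothetical stand-ins: the shapes of Token and the stream are inferred
# from how #tokenize uses them, not copied from the library.
Token = Struct.new(:type, :value, :position)

class SketchStream
  attr_reader :position

  def initialize(chars)
    @chars = chars
    @position = 0
  end

  # Character under the cursor, or nil once the input is exhausted.
  def current
    @chars[@position]
  end

  # Advances the cursor and returns the new current character.
  def next
    @position += 1
    current
  end
end

# Assumed behavior: if the character after `current` equals `expected`,
# consume both and emit `type`; otherwise emit `or_type` for `current` alone.
# Recording the start position is an assumption of this sketch.
def match_or(chars, current, expected, type, or_type)
  start = chars.position
  if chars.next == expected
    chars.next
    Token.new(type, current + expected, start)
  else
    Token.new(or_type, current, start)
  end
end

stream = SketchStream.new('||a'.chars)
p match_or(stream, '|', '|', :or, :pipe).to_a  # => [:or, "||", 0]
p stream.current                               # => "a"

stream = SketchStream.new('|a'.chars)
p match_or(stream, '|', '|', :or, :pipe).to_a  # => [:pipe, "|", 0]
```

In both outcomes the stream is left on the first character after the operator, which is what lets the main `while chars.current` loop continue without extra bookkeeping.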