Class: Tailor::Lexer

Inherits:
Ripper::Lexer
  • Object
Includes:
LogSwitch::Mixin, CompositeObservable, LexerConstants
Defined in:
lib/tailor/lexer.rb,
lib/tailor/lexer/token.rb

Overview

This class provides the main file parsing for tailor. For every lexer event that it encounters, it calls the appropriate notifier method. Notifier methods are provided by CompositeObservable.
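The changed/notify pattern used throughout this class can be sketched with a small stand-in module. This is only an illustration: MiniObservable, MiniLexer, CommaRecorder, the add_comma_observer method and the comma_update callback name are all assumptions made for the example, not tailor's actual CompositeObservable implementation.

```ruby
# Hypothetical stand-in for CompositeObservable; tailor's real module may differ.
module MiniObservable
  # Generates add_<name>_observer, <name>_changed and notify_<name>_observers.
  def define_observer(name)
    define_method("add_#{name}_observer") do |observer|
      (@observers ||= Hash.new { |h, k| h[k] = [] })[name] << observer
    end

    define_method("#{name}_changed") do
      (@changed ||= {})[name] = true
    end

    define_method("notify_#{name}_observers") do |*args|
      # Only notify if <name>_changed was called since the last notification.
      return unless (@changed ||= {}).delete(name)

      (@observers ||= {}).fetch(name, []).each do |observer|
        observer.public_send("#{name}_update", *args)
      end
    end
  end
end

# Hypothetical lexer-like class that publishes a single event type.
class MiniLexer
  extend MiniObservable
  define_observer :comma

  def on_comma(line, lineno, column)
    comma_changed
    notify_comma_observers(line, lineno, column)
  end
end

# Hypothetical observer that records every comma notification it receives.
class CommaRecorder
  attr_reader :seen

  def initialize
    @seen = []
  end

  def comma_update(line, lineno, column)
    @seen << [line, lineno, column]
  end
end

def demo_comma_notifications
  recorder = CommaRecorder.new
  lexer = MiniLexer.new
  lexer.add_comma_observer(recorder)
  lexer.on_comma('foo(1, 2)', 1, 5)
  recorder.seen
end
```

Calling demo_comma_notifications returns the single recorded notification, which mirrors how an on_* handler first marks an event as changed and then passes its arguments to the observers.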

Defined Under Namespace

Classes: Token

Constant Summary

Constants included from LexerConstants

Tailor::LexerConstants::CONTINUATION_KEYWORDS, Tailor::LexerConstants::KEYWORDS_AND_MODIFIERS, Tailor::LexerConstants::KEYWORDS_TO_INDENT, Tailor::LexerConstants::LOOP_KEYWORDS, Tailor::LexerConstants::MODIFIERS, Tailor::LexerConstants::MULTILINE_OPERATORS

Instance Method Summary

Methods included from CompositeObservable

define_observer

Constructor Details

#initialize(file) ⇒ Lexer

Returns a new instance of Lexer.

Parameters:

  • file (String)

    The string to lex, or name of the file to read and analyze.



# File 'lib/tailor/lexer.rb', line 21

def initialize(file)
  @original_file_text = if File.exist? file
    @file_name = file
    # File.read closes the file handle after reading.
    File.read(@file_name)
  else
    @file_name = '<notafile>'
    file
  end

  @file_text = ensure_trailing_newline(@original_file_text)
  @file_text = sub_line_ending_backslashes(@file_text)
  super @file_text
  @added_newline = @file_text != @original_file_text
end

Instance Method Details

#count_trailing_newlines(text) ⇒ Fixnum

Counts the number of newlines at the end of the file.

Parameters:

  • text (String)

    The file’s text.

Returns:

  • (Fixnum)

    The number of \n characters at the end of the file.



# File 'lib/tailor/lexer.rb', line 526

def count_trailing_newlines(text)
  if text.end_with? "\n"
    count = 0

    text.reverse.each_char do |c|
      if c == "\n"
        count += 1
      else
        break
      end
    end

    count
  else
    0
  end
end
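For comparison, the same count can be obtained with a single regular expression. This is an alternative sketch, not the method tailor uses; trailing_newline_count is a hypothetical name.

```ruby
# Hypothetical helper: counts the "\n" characters at the very end of +text+.
def trailing_newline_count(text)
  # /\n*\z/ always matches (possibly an empty string) at the end of the text.
  text[/\n*\z/].length
end

trailing_newline_count("x = 1")      # => 0
trailing_newline_count("x = 1\n")    # => 1
trailing_newline_count("x = 1\n\n")  # => 2
```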

#current_line_of_text ⇒ String

The current line of text being examined.

Returns:

  • (String)

    The current line of text.



# File 'lib/tailor/lexer.rb', line 518

def current_line_of_text
  @file_text.split("\n").at(lineno - 1) || ''
end
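The `|| ''` fallback matters because String#split drops trailing empty strings, so a line number that points at a trailing blank line has no entry in the resulting array. A small sketch of the same lookup, using a hypothetical standalone helper named line_at:

```ruby
# Hypothetical standalone version of the lookup; +lineno+ is 1-based.
def line_at(text, lineno)
  text.split("\n").at(lineno - 1) || ''
end

line_at("a = 1\n\nb = 2\n", 1)  # => "a = 1"
line_at("a = 1\n\nb = 2\n", 2)  # => "" (an interior blank line is kept by split)
line_at("a = 1\n\n\n", 3)       # => "" (trailing blank lines are dropped; nil becomes '')
```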

#ensure_trailing_newline(file_text) ⇒ String

Adds a newline to the end of the text if one doesn’t exist. Without doing this, Ripper won’t trigger a newline event for the last line of the file, which is required for some rulers to do their thing.

Parameters:

  • file_text (String)

    The text to check.

Returns:

  • (String)

    The file text with a newline at the end.



# File 'lib/tailor/lexer.rb', line 550

def ensure_trailing_newline(file_text)
  count_trailing_newlines(file_text) > 0 ? file_text : (file_text + "\n")
end
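The missing newline event can be observed directly with Ripper from the standard library. The helper name scanner_events is an assumption made for this sketch.

```ruby
require 'ripper'

# Hypothetical helper: returns the scanner event names Ripper emits for +source+.
def scanner_events(source)
  Ripper.lex(source).map { |elem| elem[1] }
end

scanner_events("x = 1\n").include?(:on_nl)  # => true
scanner_events("x = 1").include?(:on_nl)    # => false
```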

#lex ⇒ Object

This kicks off the process of parsing the file and publishing events as the events are discovered.



# File 'lib/tailor/lexer.rb', line 38

def lex
  file_beg_changed
  notify_file_beg_observers(@file_name)

  super

  file_end_changed
  notify_file_end_observers(count_trailing_newlines(@original_file_text))
end

#on___end__(token) ⇒ Object

Called when the lexer matches __END__.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 502

def on___end__(token)
  log "__END__: '#{token}'"
  super(token)
end

#on_backref(token) ⇒ Object



# File 'lib/tailor/lexer.rb', line 48

def on_backref(token)
  log "BACKREF: '#{token}'"
  super(token)
end

#on_backtick(token) ⇒ Object

Called when the lexer matches the first ` in a `` statement (the second matches :on_tstring_end; this may or may not be a Ruby bug).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 57

def on_backtick(token)
  log "BACKTICK: '#{token}'"
  super(token)
end
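The asymmetry between the opening and closing backtick shows up in a plain Ripper.lex run. The helper name lexed_pairs is an assumption made for this sketch.

```ruby
require 'ripper'

# Hypothetical helper: returns [event, token] pairs for +source+.
def lexed_pairs(source)
  Ripper.lex(source).map { |elem| [elem[1], elem[2]] }
end

lexed_pairs('`ls`')
# => [[:on_backtick, "`"], [:on_tstring_content, "ls"], [:on_tstring_end, "`"]]
```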

#on_CHAR(token) ⇒ Object

Called when the lexer matches CHAR.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 510

def on_CHAR(token)
  log "CHAR: '#{token}'"
  super(token)
end

#on_comma(token) ⇒ Object

Called when the lexer matches a comma.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 65

def on_comma(token)
  log "COMMA: #{token}"
  log "Line length: #{current_line_of_text.length}"

  comma_changed
  notify_comma_observers(current_line_of_text, lineno, column)

  super(token)
end

#on_comment(token) ⇒ Object

Called when the lexer matches a #. The token includes the # as well as the content after it.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 79

def on_comment(token)
  log "COMMENT: '#{token}'"

  l_token = Tailor::Lexer::Token.new(token)
  lexed_line = LexedLine.new(super, lineno)
  comment_changed
  notify_comment_observers(l_token, lexed_line, @file_text, lineno, column)

  super(token)
end

#on_const(token) ⇒ Object

Called when the lexer matches a constant (including class names, of course).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 94

def on_const(token)
  log "CONST: '#{token}'"

  l_token = Tailor::Lexer::Token.new(token)
  lexed_line = LexedLine.new(super, lineno)
  const_changed
  notify_const_observers(l_token, lexed_line, lineno, column)

  super(token)
end

#on_cvar(token) ⇒ Object

Called when the lexer matches a class variable.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 108

def on_cvar(token)
  log "CVAR: '#{token}'"
  super(token)
end

#on_embdoc(token) ⇒ Object

Called when the lexer matches the content inside a =begin/=end.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 116

def on_embdoc(token)
  log "EMBDOC: '#{token}'"
  super(token)
end

#on_embdoc_beg(token) ⇒ Object

Called when the lexer matches =begin.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 124

def on_embdoc_beg(token)
  log "EMBDOC_BEG: '#{token}'"
  super(token)
end

#on_embdoc_end(token) ⇒ Object

Called when the lexer matches =end.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 132

def on_embdoc_end(token)
  log "EMBDOC_END: '#{token}'"
  super(token)
end
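How a =begin/=end block is split across the three embdoc events can be seen by lexing a small block comment with Ripper. The helper name embdoc_pairs is an assumption made for this sketch.

```ruby
require 'ripper'

# Hypothetical helper: returns [event, token] pairs for +source+.
def embdoc_pairs(source)
  Ripper.lex(source).map { |elem| [elem[1], elem[2]] }
end

embdoc_pairs("=begin\nnotes\n=end\n")
# => [[:on_embdoc_beg, "=begin\n"], [:on_embdoc, "notes\n"], [:on_embdoc_end, "=end\n"]]
```

Each token includes its trailing newline, so the three tokens together reproduce the original text.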

#on_embexpr_beg(token) ⇒ Object

Called when the lexer matches a #{.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 140

def on_embexpr_beg(token)
  log "EMBEXPR_BEG: '#{token}'"
  current_line = LexedLine.new(super, lineno)
  embexpr_beg_changed
  notify_embexpr_beg_observers(current_line, lineno, column)
  super(token)
end

#on_embexpr_end(token) ⇒ Object

Called when the lexer matches the } that closes a #{. Note that as of MRI 1.9.3-p125, this never gets called. Logged as a bug and fixed in ruby 2.0.0-p0: bugs.ruby-lang.org/issues/6211.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 153

def on_embexpr_end(token)
  log "EMBEXPR_END: '#{token}'"
  current_line = LexedLine.new(super, lineno)
  embexpr_end_changed
  notify_embexpr_end_observers(current_line, lineno, column)
  super(token)
end
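On Ruby 2.0 and later, both interpolation events can be checked with Ripper directly. The helper name interpolation_pairs is an assumption made for this sketch.

```ruby
require 'ripper'

# Hypothetical helper: returns [event, token] pairs for +source+.
def interpolation_pairs(source)
  Ripper.lex(source).map { |elem| [elem[1], elem[2]] }
end

# Single quotes keep the interpolation literal so that Ripper sees it.
pairs = interpolation_pairs('"a#{b}c"')
pairs.include?([:on_embexpr_beg, '#{'])  # => true
pairs.include?([:on_embexpr_end, '}'])   # => true
```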

#on_embvar(token) ⇒ Object



# File 'lib/tailor/lexer.rb', line 161

def on_embvar(token)
  log "EMBVAR: '#{token}'"
  super(token)
end

#on_float(token) ⇒ Object

Called when the lexer matches a Float.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 169

def on_float(token)
  log "FLOAT: '#{token}'"
  super(token)
end

#on_gvar(token) ⇒ Object

Called when the lexer matches a global variable.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 177

def on_gvar(token)
  log "GVAR: '#{token}'"
  super(token)
end

#on_heredoc_beg(token) ⇒ Object

Called when the lexer matches the beginning of a heredoc.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 185

def on_heredoc_beg(token)
  log "HEREDOC_BEG: '#{token}'"
  super(token)
end

#on_heredoc_end(token) ⇒ Object

Called when the lexer matches the end of a heredoc.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 193

def on_heredoc_end(token)
  log "HEREDOC_END: '#{token}'"
  super(token)
end

#on_ident(token) ⇒ Object

Called when the lexer matches an identifier (method name, variable, the text part of a Symbol, etc.).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 202

def on_ident(token)
  log "IDENT: '#{token}'"
  l_token = Tailor::Lexer::Token.new(token)
  lexed_line = LexedLine.new(super, lineno)
  ident_changed
  notify_ident_observers(l_token, lexed_line, lineno, column)
  super(token)
end

#on_ignored_nl(token) ⇒ Object

Called when the lexer matches a Ruby ignored newline. Ignored newlines occur when a newline is encountered, but the statement that was expressed on that line was not completed on that line.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 216

def on_ignored_nl(token)
  log 'IGNORED_NL'

  current_line = LexedLine.new(super, lineno)
  ignored_nl_changed
  notify_ignored_nl_observers(current_line, lineno, column)

  super(token)
end
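The difference between an ignored newline and a statement-ending newline can be illustrated with a statement that continues onto a second line. The helper name newline_events is an assumption made for this sketch.

```ruby
require 'ripper'

# Hypothetical helper: returns only the newline-related events for +source+.
def newline_events(source)
  Ripper.lex(source).map { |elem| elem[1] }.select do |event|
    %i[on_nl on_ignored_nl].include?(event)
  end
end

# The statement is incomplete after "+", so the first newline is ignored.
newline_events("x = 1 +\n  2\n")  # => [:on_ignored_nl, :on_nl]
```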

#on_int(token) ⇒ Object

Called when the lexer matches an Integer.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 229

def on_int(token)
  log "INT: '#{token}'"
  super(token)
end

#on_ivar(token) ⇒ Object

Called when the lexer matches an instance variable.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 237

def on_ivar(token)
  log "IVAR: '#{token}'"
  super(token)
end

#on_kw(token) ⇒ Object

Called when the lexer matches a Ruby keyword.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 245

def on_kw(token)
  log "KW: #{token}"
  current_line = LexedLine.new(super, lineno)

  l_token = Tailor::Lexer::Token.new(token,
    {
      loop_with_do: current_line.loop_with_do?,
      full_line_of_text: current_line_of_text
    }
  )

  kw_changed
  notify_kw_observers(l_token, current_line, lineno, column)

  super(token)
end

#on_label(token) ⇒ Object

Called when the lexer matches a label (the first part in a non-rocket style Hash).

Example:

one: 1     # Matches one:

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 269

def on_label(token)
  log "LABEL: '#{token}'"
  super(token)
end
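That the label token includes the trailing colon can be confirmed with Ripper. The helper name label_tokens is an assumption made for this sketch.

```ruby
require 'ripper'

# Hypothetical helper: returns the text of every label token in +source+.
def label_tokens(source)
  Ripper.lex(source)
        .select { |elem| elem[1] == :on_label }
        .map { |elem| elem[2] }
end

label_tokens("{ one: 1, two: 2 }")  # => ["one:", "two:"]
label_tokens("{ :one => 1 }")       # => []
```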

#on_lbrace(token) ⇒ Object

Called when the lexer matches a {. Note a #{ match calls #on_embexpr_beg.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 278

def on_lbrace(token)
  log "LBRACE: '#{token}'"
  current_line = LexedLine.new(super, lineno)
  lbrace_changed
  notify_lbrace_observers(current_line, lineno, column)
  super(token)
end

#on_lbracket(token) ⇒ Object

Called when the lexer matches a [.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 289

def on_lbracket(token)
  log "LBRACKET: '#{token}'"
  current_line = LexedLine.new(super, lineno)
  lbracket_changed
  notify_lbracket_observers(current_line, lineno, column)
  super(token)
end

#on_lparen(token) ⇒ Object

Called when the lexer matches a (.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 300

def on_lparen(token)
  log "LPAREN: '#{token}'"
  lparen_changed
  notify_lparen_observers(lineno, column)
  super(token)
end

#on_nl(token) ⇒ Object

This is the first thing that exists on a new line–NOT the last!



# File 'lib/tailor/lexer.rb', line 308

def on_nl(token)
  log 'NL'
  current_line = LexedLine.new(super, lineno)

  nl_changed
  notify_nl_observers(current_line, lineno, column)

  super(token)
end

#on_op(token) ⇒ Object

Called when the lexer matches an operator.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 321

def on_op(token)
  log "OP: '#{token}'"
  super(token)
end

#on_period(token) ⇒ Object

Called when the lexer matches a period.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 329

def on_period(token)
  log "PERIOD: '#{token}'"

  period_changed
  notify_period_observers(current_line_of_text.length, lineno, column)

  super(token)
end

#on_qwords_beg(token) ⇒ Object

Called when the lexer matches ‘%w’. The statement is ended by a :on_tstring_end.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 342

def on_qwords_beg(token)
  log "QWORDS_BEG: '#{token}'"
  super(token)
end
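The events surrounding a %w literal can be inspected with Ripper. The helper name word_array_pairs is an assumption made for this sketch; note that the opening token includes the delimiter.

```ruby
require 'ripper'

# Hypothetical helper: returns [event, token] pairs for +source+.
def word_array_pairs(source)
  Ripper.lex(source).map { |elem| [elem[1], elem[2]] }
end

pairs = word_array_pairs('%w[a b]')
pairs.first                           # => [:on_qwords_beg, "%w["]
pairs.include?([:on_words_sep, ' '])  # => true
pairs.last                            # => [:on_tstring_end, "]"]
```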

#on_rbrace(token) ⇒ Object

Called when the lexer matches a }.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 350

def on_rbrace(token)
  log "RBRACE: '#{token}'"

  current_line = LexedLine.new(super, lineno)
  rbrace_changed
  notify_rbrace_observers(current_line, lineno, column)

  super(token)
end

#on_rbracket(token) ⇒ Object

Called when the lexer matches a ].

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 363

def on_rbracket(token)
  log "RBRACKET: '#{token}'"

  current_line = LexedLine.new(super, lineno)
  rbracket_changed
  notify_rbracket_observers(current_line, lineno, column)

  super(token)
end

#on_regexp_beg(token) ⇒ Object

Called when the lexer matches the beginning of a Regexp.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 376

def on_regexp_beg(token)
  log "REGEXP_BEG: '#{token}'"
  super(token)
end

#on_regexp_end(token) ⇒ Object

Called when the lexer matches the end of a Regexp.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 384

def on_regexp_end(token)
  log "REGEXP_END: '#{token}'"
  super(token)
end

#on_rparen(token) ⇒ Object

Called when the lexer matches a ).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 392

def on_rparen(token)
  log "RPAREN: '#{token}'"

  current_line = LexedLine.new(super, lineno)
  rparen_changed
  notify_rparen_observers(current_line, lineno, column)

  super(token)
end

#on_semicolon(token) ⇒ Object

Called when the lexer matches a ;.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 405

def on_semicolon(token)
  log "SEMICOLON: '#{token}'"
  super(token)
end

#on_sp(token) ⇒ Object

Called when the lexer matches any type of space character.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 413

def on_sp(token)
  log "SP: '#{token}'; size: #{token.size}"
  l_token = Tailor::Lexer::Token.new(token)
  sp_changed
  notify_sp_observers(l_token, lineno, column)

  # Deal with lines that end with \
  if token == "\\\n"
    current_line = LexedLine.new(super, lineno)
    ignored_nl_changed
    notify_ignored_nl_observers(current_line, lineno, column)
  end
  super(token)
end
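The special case for lines ending in a backslash exists because Ripper reports the backslash and the newline together as a space token rather than as a newline event. A quick check, with space_tokens as a hypothetical helper name:

```ruby
require 'ripper'

# Hypothetical helper: returns the text of every :on_sp token in +source+.
def space_tokens(source)
  Ripper.lex(source)
        .select { |elem| elem[1] == :on_sp }
        .map { |elem| elem[2] }
end

# The "\\\n" below is a backslash followed by a newline in the lexed source.
space_tokens("x = 1 \\\n  + 2\n").include?("\\\n")  # => true
```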

#on_symbeg(token) ⇒ Object

Called when the lexer matches the : at the beginning of a Symbol.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 431

def on_symbeg(token)
  log "SYMBEG: '#{token}'"
  super(token)
end

#on_tlambda(token) ⇒ Object

Called when the lexer matches the -> as a lambda.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 439

def on_tlambda(token)
  log "TLAMBDA: '#{token}'"
  super(token)
end

#on_tlambeg(token) ⇒ Object

Called when the lexer matches the { that represents the beginning of a -> lambda.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 448

def on_tlambeg(token)
  log "TLAMBEG: '#{token}'"
  super(token)
end

#on_tstring_beg(token) ⇒ Object

Called when the lexer matches the beginning of a String.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 456

def on_tstring_beg(token)
  log "TSTRING_BEG: '#{token}'"
  current_line = LexedLine.new(super, lineno)
  tstring_beg_changed
  notify_tstring_beg_observers(current_line, lineno)
  super(token)
end

#on_tstring_content(token) ⇒ Object

Called when the lexer matches the content of any String.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 467

def on_tstring_content(token)
  log "TSTRING_CONTENT: '#{token}'"
  super(token)
end

#on_tstring_end(token) ⇒ Object

Called when the lexer matches the end of a String.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 475

def on_tstring_end(token)
  log "TSTRING_END: '#{token}'"
  tstring_end_changed
  notify_tstring_end_observers(lineno)
  super(token)
end

#on_words_beg(token) ⇒ Object

Called when the lexer matches ‘%W’.

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 485

def on_words_beg(token)
  log "WORDS_BEG: '#{token}'"
  super(token)
end

#on_words_sep(token) ⇒ Object

Called when the lexer matches the separators in a %w or %W (by default, this is a single space).

Parameters:

  • token (String)

    The token that the lexer matched.



# File 'lib/tailor/lexer.rb', line 494

def on_words_sep(token)
  log "WORDS_SEP: '#{token}'"
  super(token)
end