Class: Rouge::Lexer Abstract
- Inherits: Object
- Includes: Token::Tokens
- Defined in: lib/rouge/lexer.rb
Overview
A lexer transforms text into a stream of `[token, chunk]` pairs.
Direct Known Subclasses
Rouge::Lexers::ConsoleLexer, Rouge::Lexers::PlainText, RegexLexer
Constant Summary
Constants included from Token::Tokens
Token::Tokens::Num, Token::Tokens::Str
Instance Attribute Summary
-
#options ⇒ Object
readonly
The options hash this lexer was created with. Keys are stored as strings.
Class Method Summary
-
.aliases(*args) ⇒ Object
Used to specify alternate names this lexer class may be found by.
-
.all ⇒ Object
A list of all lexers.
- .assert_utf8!(str) ⇒ Object
- .debug_enabled? ⇒ Boolean
-
.demo(arg = :absent) ⇒ Object
Specify or get a small demo string for this lexer.
-
.demo_file(arg = :absent) ⇒ Object
Specify or get the path name containing a small demo for this lexer (can be overridden by Lexer.demo).
-
.desc(arg = :absent) ⇒ Object
Specify or get this lexer’s description.
-
.detect?(text) ⇒ Boolean
abstract
Return true if there is an in-text indication (such as a shebang or DOCTYPE declaration) that this lexer should be used.
- .disable_debug! ⇒ Object
- .enable_debug! ⇒ Object
-
.filenames(*fnames) ⇒ Object
Specify a list of filename globs associated with this lexer.
-
.find(name) ⇒ Class<Rouge::Lexer>?
Given a name as a string, return the matching lexer class.
-
.find_fancy(str, code = nil, additional_options = {}) ⇒ Object
Find a lexer, with fancy shiny features.
-
.guess(info = {}, &fallback) ⇒ Class<Rouge::Lexer>
Guess which lexer to use based on a hash of info.
- .guess_by_filename(fname) ⇒ Object
- .guess_by_mimetype(mt) ⇒ Object
- .guess_by_source(source) ⇒ Object
-
.guesses(info = {}) ⇒ Object
Return the list of candidate lexers for a hash of info, without raising on ambiguity.
-
.lex(stream, opts = {}, &b) ⇒ Object
Lexes `stream` with the given options.
-
.mimetypes(*mts) ⇒ Object
Specify a list of mimetypes associated with this lexer.
- .option(name, desc) ⇒ Object
- .option_docs ⇒ Object
-
.tag(t = nil) ⇒ Object
Used to specify or get the canonical name of this lexer class.
-
.title(t = nil) ⇒ Object
Specify or get this lexer’s title.
Instance Method Summary
- #as_bool(val) ⇒ Object
- #as_lexer(val) ⇒ Object
- #as_list(val) ⇒ Object
- #as_string(val) ⇒ Object
- #as_token(val) ⇒ Object
- #bool_option(name, &default) ⇒ Object
- #hash_option(name, defaults, &val_cast) ⇒ Object
-
#initialize(opts = {}) ⇒ Lexer
constructor
Create a new lexer with the given options.
-
#lex(string, opts = {}, &b) ⇒ Object
Given a string, yield [token, chunk] pairs.
- #lexer_option(name, &default) ⇒ Object
- #list_option(name, &default) ⇒ Object
-
#reset! ⇒ Object
abstract
Resets lexer state between lexes.
-
#stream_tokens(stream, &b) ⇒ Object
abstract
Yield `[token, chunk]` pairs, given a prepared input stream.
- #string_option(name, &default) ⇒ Object
-
#tag ⇒ Object
Delegated to Lexer.tag.
- #token_option(name, &default) ⇒ Object
Methods included from Token::Tokens
Constructor Details
#initialize(opts = {}) ⇒ Lexer
Create a new lexer with the given options. Individual lexers may specify extra options. The only current globally accepted option is `:debug`.
    # File 'lib/rouge/lexer.rb', line 284
    def initialize(opts={})
      @options = {}
      opts.each { |k, v| @options[k.to_s] = v }

      @debug = Lexer.debug_enabled? && bool_option(:debug)
    end
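The constructor stores every option under a string key, which is why the option helpers further down look values up with `name.to_s`. A minimal sketch of that normalization, assuming a free-standing helper named `normalize_options` (a hypothetical name, not part of Rouge):

```ruby
# Hypothetical helper: mirrors the key normalization shown in #initialize.
def normalize_options(opts = {})
  normalized = {}
  opts.each { |key, value| normalized[key.to_s] = value }
  normalized
end

# Symbol and string keys end up in the same string-keyed hash.
options = normalize_options({ debug: true, 'parent' => 'tex' })
puts options.keys.join(', ')
```

Because of this, passing `:debug` or `'debug'` should be equivalent from the caller's point of view.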
Instance Attribute Details
#options ⇒ Object (readonly)
Returns the options hash built by the constructor.
    # File 'lib/rouge/lexer.rb', line 274
    def options
      @options
    end
Class Method Details
.aliases(*args) ⇒ Object
Used to specify alternate names this lexer class may be found by.
    # File 'lib/rouge/lexer.rb', line 231
    def aliases(*args)
      args.map!(&:to_s)
      args.each { |arg| Lexer.register(arg, self) }
      (@aliases ||= []).concat(args)
    end
.all ⇒ Object
Returns a list of all lexers.
    # File 'lib/rouge/lexer.rb', line 120
    def all
      registry.values.uniq
    end
.assert_utf8!(str) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 258
    def assert_utf8!(str)
      return if %w(US-ASCII UTF-8 ASCII-8BIT).include? str.encoding.name
      raise EncodingError.new(
        "Bad encoding: #{str.encoding.names.join(',')}. " +
        "Please convert your string to UTF-8."
      )
    end
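The check above compares the string's encoding name against a short allow-list and raises otherwise. As a rough illustration, the sketch below expresses the same test as a predicate (`acceptable_encoding?` is a hypothetical name) and shows that re-encoding to UTF-8 is enough to pass it:

```ruby
# Hypothetical predicate form of the same test; the original raises instead.
def acceptable_encoding?(str)
  %w(US-ASCII UTF-8 ASCII-8BIT).include?(str.encoding.name)
end

utf16 = 'puts 1'.encode('UTF-16LE') # not on the allow-list
utf8  = utf16.encode('UTF-8')       # converted back, now accepted

puts acceptable_encoding?(utf16)
puts acceptable_encoding?(utf8)
```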
.debug_enabled? ⇒ Boolean
    # File 'lib/rouge/lexer.rb', line 194
    def debug_enabled?
      !!@debug_enabled
    end
.demo(arg = :absent) ⇒ Object
Specify or get a small demo string for this lexer.
    # File 'lib/rouge/lexer.rb', line 113
    def demo(arg=:absent)
      return @demo = arg unless arg == :absent

      @demo = File.read(demo_file, mode: 'rt:bom|utf-8')
    end
.demo_file(arg = :absent) ⇒ Object
Specify or get the path name containing a small demo for this lexer (can be overridden by demo).
    # File 'lib/rouge/lexer.rb', line 106
    def demo_file(arg=:absent)
      return @demo_file = Pathname.new(arg) unless arg == :absent

      @demo_file = Pathname.new(__FILE__).dirname.join('demos', tag)
    end
.desc(arg = :absent) ⇒ Object
Specify or get this lexer’s description.
    # File 'lib/rouge/lexer.rb', line 88
    def desc(arg=:absent)
      if arg == :absent
        @desc
      else
        @desc = arg
      end
    end
.detect?(text) ⇒ Boolean
Return true if there is an in-text indication (such as a shebang or DOCTYPE declaration) that this lexer should be used.
    # File 'lib/rouge/lexer.rb', line 446
    def self.detect?(text)
      false
    end
.disable_debug! ⇒ Object
    # File 'lib/rouge/lexer.rb', line 190
    def disable_debug!
      @debug_enabled = false
    end
.enable_debug! ⇒ Object
    # File 'lib/rouge/lexer.rb', line 186
    def enable_debug!
      @debug_enabled = true
    end
.filenames(*fnames) ⇒ Object
Specify a list of filename globs associated with this lexer.
    # File 'lib/rouge/lexer.rb', line 243
    def filenames(*fnames)
      (@filenames ||= []).concat(fnames)
    end
.find(name) ⇒ Class<Rouge::Lexer>?
Given a name as a string, return the matching lexer class.
    # File 'lib/rouge/lexer.rb', line 29
    def find(name)
      registry[name.to_s]
    end
.find_fancy(str, code = nil, additional_options = {}) ⇒ Object
Find a lexer, with fancy shiny features.
- The string you pass can include CGI-style options:
      Lexer.find_fancy('erb?parent=tex')
- You can pass the special name 'guess' to have the lexer guessed for you, and you can pass the code to guess by as a second argument:
      Lexer.find_fancy('guess', "#!/bin/bash\necho Hello, world")
This is used in the Redcarpet plugin as well as Rouge’s own markdown lexer for highlighting internal code blocks.
    # File 'lib/rouge/lexer.rb', line 47
    def find_fancy(str, code=nil, additional_options={})
      if str && !str.include?('?') && str != 'guess'
        lexer_class = find(str)
        return lexer_class && lexer_class.new(additional_options)
      end

      name, opts = str ? str.split('?', 2) : [nil, '']

      # parse the options hash from a cgi-style string
      opts = CGI.parse(opts || '').map do |k, vals|
        val = case vals.size
        when 0 then true
        when 1 then vals[0]
        else vals
        end

        [ k.to_s, val ]
      end

      opts = additional_options.merge(Hash[opts])

      lexer_class = case name
      when 'guess', nil
        self.guess(:source => code, :mimetype => opts['mimetype'])
      when String
        self.find(name)
      end

      lexer_class && lexer_class.new(opts)
    end
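To make the `name?key=value` convention concrete, here is a small stand-alone sketch of the splitting step. It is not Rouge's code: `split_lexer_spec` is a hypothetical helper, and `URI.decode_www_form` from the standard library stands in for `CGI.parse`, so edge cases such as a bare key without `=` may behave differently from the method above.

```ruby
require 'uri'

# Hypothetical helper: splits "name?key=value" into a name and an options hash.
def split_lexer_spec(spec)
  name, query = spec.split('?', 2)
  grouped = {}
  URI.decode_www_form(query || '').each do |key, value|
    (grouped[key] ||= []) << value
  end
  # A single value is unwrapped; repeated keys stay as an array.
  opts = grouped.transform_values { |vals| vals.size == 1 ? vals[0] : vals }
  [name, opts]
end

name, opts = split_lexer_spec('erb?parent=tex')
_, multi_opts = split_lexer_spec('html?x=1&x=2')
plain_name, plain_opts = split_lexer_spec('ruby')

puts name
puts opts['parent']
```

The unwrapping of single values mirrors the `when 1 then vals[0]` branch, which is why `parent` arrives at the lexer as a plain string rather than a one-element array.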
.guess(info = {}, &fallback) ⇒ Class<Rouge::Lexer>
Guess which lexer to use based on a hash of info.
    # File 'lib/rouge/lexer.rb', line 161
    def guess(info={}, &fallback)
      lexers = guesses(info)

      return Lexers::PlainText if lexers.empty?
      return lexers[0] if lexers.size == 1

      if fallback
        fallback.call(lexers)
      else
        raise Guesser::Ambiguous.new(lexers)
      end
    end
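The selection logic above has four outcomes: a default when nothing matches, the single match, the fallback block's choice, or an error. The sketch below reproduces just that decision with plain strings in place of lexer classes; `pick_candidate` and the use of `ArgumentError` are illustrative assumptions, not Rouge's API.

```ruby
# Hypothetical stand-in for the selection step in Lexer.guess.
def pick_candidate(candidates, default, &fallback)
  return default if candidates.empty?
  return candidates[0] if candidates.size == 1
  return fallback.call(candidates) if fallback

  raise ArgumentError, "ambiguous: #{candidates.join(', ')}"
end

none     = pick_candidate([], 'plaintext')
single   = pick_candidate(['ruby'], 'plaintext')
resolved = pick_candidate(%w(c cpp), 'plaintext') { |names| names.last }

puts [none, single, resolved].join(' ')
```

Passing a block is therefore the way to avoid an exception when several candidates remain.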
.guess_by_filename(fname) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 178
    def guess_by_filename(fname)
      guess :filename => fname
    end
.guess_by_mimetype(mt) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 174
    def guess_by_mimetype(mt)
      guess :mimetype => mt
    end
.guess_by_source(source) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 182
    def guess_by_source(source)
      guess :source => source
    end
.guesses(info = {}) ⇒ Object
Guess which lexer to use based on a hash of info.
This accepts the same arguments as Lexer.guess, but will never throw an error. It will return a (possibly empty) list of potential lexers to use.
    # File 'lib/rouge/lexer.rb', line 129
    def guesses(info={})
      mimetype, filename, source = info.values_at(:mimetype, :filename, :source)
      custom_globs = info[:custom_globs]

      guessers = (info[:guessers] || []).dup

      guessers << Guessers::Mimetype.new(mimetype) if mimetype
      guessers << Guessers::GlobMapping.by_pairs(custom_globs, filename) if custom_globs && filename
      guessers << Guessers::Filename.new(filename) if filename
      guessers << Guessers::Modeline.new(source) if source
      guessers << Guessers::Source.new(source) if source
      guessers << Guessers::Disambiguation.new(filename, source) if source && filename

      Guesser.guess(guessers, Lexer.all)
    end
.lex(stream, opts = {}, &b) ⇒ Object
Lexes `stream` with the given options. The lex is delegated to a new instance.
    # File 'lib/rouge/lexer.rb', line 22
    def lex(stream, opts={}, &b)
      new(opts).lex(stream, &b)
    end
.mimetypes(*mts) ⇒ Object
Specify a list of mimetypes associated with this lexer.
    # File 'lib/rouge/lexer.rb', line 253
    def mimetypes(*mts)
      (@mimetypes ||= []).concat(mts)
    end
.option(name, desc) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 100
    def option(name, desc)
      option_docs[name.to_s] = desc
    end
.option_docs ⇒ Object
    # File 'lib/rouge/lexer.rb', line 96
    def option_docs
      @option_docs ||= InheritableHash.new(superclass.option_docs)
    end
.tag(t = nil) ⇒ Object
Used to specify or get the canonical name of this lexer class.
    # File 'lib/rouge/lexer.rb', line 215
    def tag(t=nil)
      return @tag if t.nil?

      @tag = t.to_s
      Lexer.register(@tag, self)
    end
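Both `tag` and `aliases` call `Lexer.register`, and `find` and `all` read from `registry`. The body of `register` is not shown on this page, so the following is only a guess at the shape of that mechanism: a plain hash from name to lexer, with symbols standing in for lexer classes and a hypothetical `register` helper.

```ruby
# Hypothetical registry: a plain hash keyed by stringified name.
def register(registry, lexer, tag, *aliases)
  ([tag] + aliases).each { |name| registry[name.to_s] = lexer }
  registry
end

registry = {}
register(registry, :RubyLexer, :ruby, 'rb')
register(registry, :ShellLexer, 'shell', 'bash', 'sh')

by_tag   = registry['ruby']     # lookup by canonical name, as in find
by_alias = registry['sh']       # lookup by alias goes through the same hash
all      = registry.values.uniq # one entry per lexer, as in all

puts all.size
```

Under this assumption, the `uniq` in `all` is what keeps a lexer with several aliases from being listed more than once.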
.title(t = nil) ⇒ Object
Specify or get this lexer’s title. Meant to be human-readable.
    # File 'lib/rouge/lexer.rb', line 80
    def title(t=nil)
      if t.nil?
        t = tag.capitalize
      end
      @title ||= t
    end
Instance Method Details
#as_bool(val) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 291
    def as_bool(val)
      case val
      when nil, false, 0, '0', 'off'
        false
      when Array
        val.empty? ? true : as_bool(val.last)
      else
        true
      end
    end
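The coercion rules are easy to misread, so here is the same method copied out as a free-standing function with a few inputs worth noting. Only the listed values count as false, so the string `'false'` is treated as true, and for arrays the last element decides.

```ruby
# Copy of the coercion rules above as a free-standing method.
def as_bool(val)
  case val
  when nil, false, 0, '0', 'off'
    false
  when Array
    val.empty? ? true : as_bool(val.last)
  else
    true
  end
end

off_string   = as_bool('off')      # listed as false
false_string = as_bool('false')    # not listed, so it falls through to true
empty_array  = as_bool([])         # an empty array counts as true
last_wins    = as_bool(['1', '0']) # decided by the last element

puts [off_string, false_string, empty_array, last_wins].inspect
```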
#as_lexer(val) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 319
    def as_lexer(val)
      return as_lexer(val.last) if val.is_a?(Array)
      return val.new(@options) if val.is_a?(Class) && val < Lexer

      case val
      when Lexer
        val
      when String
        lexer_class = Lexer.find(val)
        lexer_class && lexer_class.new(@options)
      end
    end
#as_list(val) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 308
    def as_list(val)
      case val
      when Array
        val.flat_map { |v| as_list(v) }
      when String
        val.split(',')
      else
        []
      end
    end
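A quick illustration of how this coercion flattens its input, again with the method copied out so it runs on its own. Note that the split is on a bare comma, so surrounding whitespace is kept.

```ruby
# Copy of the list coercion above as a free-standing method.
def as_list(val)
  case val
  when Array
    val.flat_map { |v| as_list(v) }
  when String
    val.split(',')
  else
    []
  end
end

from_string = as_list('a,b')
from_nested = as_list(['a,b', ['c']]) # nested arrays are flattened recursively
from_nil    = as_list(nil)            # anything else becomes an empty list
with_space  = as_list('a, b')         # whitespace after the comma is preserved

puts from_nested.join('|')
```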
#as_string(val) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 302
    def as_string(val)
      return as_string(val.last) if val.is_a?(Array)

      val ? val.to_s : nil
    end
#as_token(val) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 332
    def as_token(val)
      return as_token(val.last) if val.is_a?(Array)
      case val
      when Token
        val
      else
        Token[val]
      end
    end
#bool_option(name, &default) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 342
    def bool_option(name, &default)
      if @options.key?(name.to_s)
        as_bool(@options[name.to_s])
      else
        default ? default.call : false
      end
    end
#hash_option(name, defaults, &val_cast) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 366
    def hash_option(name, defaults, &val_cast)
      name = name.to_s
      out = defaults.dup

      base = @options.delete(name.to_s)
      base = {} unless base.is_a?(Hash)
      base.each { |k, v| out[k.to_s] = val_cast ? val_cast.call(v) : v }

      @options.keys.each do |key|
        next unless key =~ /(\w+)\[(\w+)\]/ and $1 == name
        value = @options.delete(key)

        out[$2] = val_cast ? val_cast.call(value) : value
      end

      out
    end
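This method accepts a hash-valued option in two spellings: a nested hash under `name`, or flat keys of the form `name[key]`. The sketch below is a simplified free-standing variant (hypothetical name `extract_hash_option`) that takes the options hash as an argument instead of reading `@options` and leaves out the optional value cast.

```ruby
# Simplified variant of hash_option operating on an explicit options hash.
def extract_hash_option(options, name, defaults)
  name = name.to_s
  out = defaults.dup

  base = options.delete(name)
  base = {} unless base.is_a?(Hash)
  base.each { |k, v| out[k.to_s] = v }

  # Flat "name[key]" entries are folded in and removed from the options.
  options.keys.each do |key|
    match = /(\w+)\[(\w+)\]/.match(key)
    next unless match && match[1] == name

    out[match[2]] = options.delete(key)
  end

  out
end

options = { 'vars' => { 'a' => '1' }, 'vars[b]' => '2', 'debug' => '1' }
vars = extract_hash_option(options, :vars, { 'c' => '3' })

puts vars.keys.sort.join(',')
```

As in the original, the matching entries are deleted from the options hash, so unrelated options such as `debug` are all that remain afterwards.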
#lex(string, opts = {}, &b) ⇒ Object
Given a string, yield [token, chunk] pairs. If no block is given, an enumerator is returned.
    # File 'lib/rouge/lexer.rb', line 396
    def lex(string, opts={}, &b)
      return enum_for(:lex, string, opts) unless block_given?

      Lexer.assert_utf8!(string)

      reset! unless opts[:continue]

      # consolidate consecutive tokens of the same type
      last_token = nil
      last_val = nil
      stream_tokens(string) do |tok, val|
        next if val.empty?

        if tok == last_token
          last_val << val
          next
        end

        b.call(last_token, last_val) if last_token
        last_token = tok
        last_val = val
      end

      b.call(last_token, last_val) if last_token
    end
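The consolidation step is the part of this method most worth seeing in isolation: empty chunks are dropped and adjacent chunks with the same token are joined. A minimal sketch over an array of pairs, using symbols instead of Rouge tokens and a hypothetical `consolidate` helper that returns an array rather than yielding:

```ruby
# Hypothetical helper: merges adjacent pairs that share a token.
def consolidate(pairs)
  merged = []
  pairs.each do |tok, val|
    next if val.empty? # empty chunks are skipped and do not break a run

    if !merged.empty? && merged.last[0] == tok
      merged.last[1] << val
    else
      merged << [tok, val.dup] # dup so the caller's strings are not mutated
    end
  end
  merged
end

pairs = [[:name, 'fo'], [:name, 'o'], [:text, ''], [:punct, '('], [:punct, ')']]
merged = consolidate(pairs)

merged.each { |tok, val| puts "#{tok}: #{val}" }
```

One difference from the original is the `dup`: the method above appends to the chunk it received from `stream_tokens`, whereas this sketch copies first.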
#lexer_option(name, &default) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 354
    def lexer_option(name, &default)
      as_lexer(@options.delete(name.to_s, &default))
    end
#list_option(name, &default) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 358
    def list_option(name, &default)
      as_list(@options.delete(name.to_s, &default))
    end
#reset! ⇒ Object
Resets any state kept between lexes. In the code shown for #lex, it is called at the start of each lex unless `opts[:continue]` is set. The default implementation is a no-op.
    # File 'lib/rouge/lexer.rb', line 388
    def reset!
    end
#stream_tokens(stream, &b) ⇒ Object
Yield `[token, chunk]` pairs, given a prepared input stream. This must be implemented.
    # File 'lib/rouge/lexer.rb', line 434
    def stream_tokens(stream, &b)
      raise 'abstract'
    end
#string_option(name, &default) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 350
    def string_option(name, &default)
      as_string(@options.delete(name.to_s, &default))
    end
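`string_option`, `lexer_option`, `list_option` and `token_option` all read their value with `Hash#delete` and pass the default block straight through. That has two consequences worth spelling out: the option is removed from the hash once read, and the block only runs when the key is missing. A small standard-library illustration, with made-up option names:

```ruby
options = { 'parent' => 'tex' }

# Key present: the stored value is returned and removed; the block is not run.
parent = options.delete('parent') { 'html' }

# Key absent: the block supplies the fallback value.
theme = options.delete('theme') { 'base16' }

puts parent
puts theme
puts options.empty?
```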
#tag ⇒ Object
Delegated to Lexer.tag.
    # File 'lib/rouge/lexer.rb', line 423
    def tag
      self.class.tag
    end
#token_option(name, &default) ⇒ Object
    # File 'lib/rouge/lexer.rb', line 362
    def token_option(name, &default)
      as_token(@options.delete(name.to_s, &default))
    end