Class: Rouge::Lexer Abstract
- Inherits: Object (hierarchy: Object, Rouge::Lexer)
- Includes: Token::Tokens
- Defined in: lib/rouge/lexer.rb
Overview
A lexer transforms text into a stream of `[token, chunk]` pairs.
Direct Known Subclasses
Defined Under Namespace
Classes: AmbiguousGuess
Constant Summary
Constants included from Token::Tokens
Token::Tokens::Num, Token::Tokens::Str
Class Method Summary
- .aliases(*args) ⇒ Object
  Used to specify alternate names this lexer class may be found by.
- .all ⇒ Object
  A list of all lexers.
- .analyze_text(text) ⇒ Object (abstract)
  Return a number between 0 and 1 indicating the likelihood that the text given should be lexed with this lexer.
- .assert_utf8!(str) ⇒ Object
- .default_options(o = {}) ⇒ Object
- .demo(arg = :absent) ⇒ Object
  Specify or get a small demo string for this lexer.
- .demo_file(arg = :absent) ⇒ Object
  Specify or get the path name containing a small demo for this lexer (can be overridden by Lexer.demo).
- .desc(arg = :absent) ⇒ Object
  Specify or get this lexer’s description.
- .filenames(*fnames) ⇒ Object
  Specify a list of filename globs associated with this lexer.
- .find(name) ⇒ Object
  Given a string, return the correct lexer class.
- .find_fancy(str, code = nil) ⇒ Object
  Find a lexer, with fancy shiny features.
- .guess(info = {}) ⇒ Object
  Guess which lexer to use based on a hash of info.
- .guess_by_filename(fname) ⇒ Object
- .guess_by_mimetype(mt) ⇒ Object
- .guess_by_source(source) ⇒ Object
- .guesses(info = {}) ⇒ Object
  Guess which lexer to use based on a hash of info.
- .lex(stream, opts = {}, &b) ⇒ Object
  Lexes `stream` with the given options.
- .mimetypes(*mts) ⇒ Object
  Specify a list of mimetypes associated with this lexer.
- .tag(t = nil) ⇒ Object
  Used to specify or get the canonical name of this lexer class.
- .title(t = nil) ⇒ Object
  Specify or get this lexer’s title.
Instance Method Summary
- #debug ⇒ Object (deprecated)
- #initialize(opts = {}) ⇒ Lexer (constructor)
  Create a new lexer with the given options.
- #lex(string, opts = {}, &b) ⇒ Object
  Given a string, yield `[token, chunk]` pairs.
- #option(k, v = :absent) ⇒ Object
  Get or specify one option for this lexer.
- #options(o = {}) ⇒ Object
  Get and/or specify the options for this lexer.
- #reset! ⇒ Object (abstract)
  Resets lexer state; called by #lex before lexing unless `:continue` is set.
- #stream_tokens(stream, &b) ⇒ Object (abstract)
  Yield `[token, chunk]` pairs, given a prepared input stream.
- #tag ⇒ Object
  Delegated to Lexer.tag.
Methods included from Token::Tokens
Constructor Details
#initialize(opts = {}) ⇒ Lexer
Create a new lexer with the given options. Individual lexers may specify extra options. The only current globally accepted option is `:debug`.
# File 'lib/rouge/lexer.rb', line 255
def initialize(opts={})
  options(opts)
  @debug = option(:debug)
end
Class Method Details
.aliases(*args) ⇒ Object
Used to specify alternate names this lexer class may be found by.
# File 'lib/rouge/lexer.rb', line 203
def aliases(*args)
  args.map!(&:to_s)
  args.each { |arg| Lexer.register(arg, self) }
  (@aliases ||= []).concat(args)
end
.all ⇒ Object
Returns a list of all lexers.
# File 'lib/rouge/lexer.rb', line 101
def all
  registry.values.uniq
end
.analyze_text(text) ⇒ Object
Return a number between 0 and 1 indicating the likelihood that the text given should be lexed with this lexer. The default implementation returns 0. Values under 0.5 will only be used to disambiguate filename or mimetype matches.
# File 'lib/rouge/lexer.rb', line 358
def self.analyze_text(text)
  0
end
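As a rough illustration of the scoring contract described above, a subclass might override the method along these lines. `SketchLexer` and `ShellSketch` are hypothetical stand-ins (not Rouge classes) so that the snippet runs without the gem, and the shebang heuristic is an assumption chosen for the example.

```ruby
# Hypothetical stand-in for the abstract base class; not Rouge's own code.
class SketchLexer
  # Default score: no opinion about the text.
  def self.analyze_text(text)
    0
  end
end

# Hypothetical subclass that scores shell scripts by their shebang line.
class ShellSketch < SketchLexer
  def self.analyze_text(text)
    # A shebang is treated as strong evidence; anything else scores 0.
    text.start_with?('#!/bin/bash', '#!/bin/sh') ? 1 : 0
  end
end

ShellSketch.analyze_text("#!/bin/bash\necho hi") # scores 1
ShellSketch.analyze_text("puts 1")               # scores 0
```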
.assert_utf8!(str) ⇒ Object
# File 'lib/rouge/lexer.rb', line 230
def assert_utf8!(str)
  return if %w(US-ASCII UTF-8 ASCII-8BIT).include? str.encoding.name
  raise EncodingError.new(
    "Bad encoding: #{str.encoding.names.join(',')}. " +
    "Please convert your string to UTF-8."
  )
end
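The check above inspects only the string's encoding label, so input in another encoding has to be transcoded before lexing. A minimal sketch using a standalone copy of the test; `accepted_encoding?` and `ACCEPTED_ENCODINGS` are hypothetical names, not part of Rouge:

```ruby
# Standalone copy of the encoding test shown above, for illustration only.
ACCEPTED_ENCODINGS = %w(US-ASCII UTF-8 ASCII-8BIT).freeze

def accepted_encoding?(str)
  ACCEPTED_ENCODINGS.include?(str.encoding.name)
end

# A Latin-1 string: the byte 0xE9 is "e with acute accent" in ISO-8859-1.
latin1 = "caf\xE9".dup.force_encoding('ISO-8859-1')
accepted_encoding?(latin1) # false, so assert_utf8! would raise for this input

# Transcoding first yields a string that passes the check.
utf8 = latin1.encode('UTF-8')
accepted_encoding?(utf8)   # true
```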
.default_options(o = {}) ⇒ Object
# File 'lib/rouge/lexer.rb', line 23
def default_options(o={})
  @default_options ||= {}
  @default_options.merge!(o)
  @default_options
end
.demo(arg = :absent) ⇒ Object
Specify or get a small demo string for this lexer.
# File 'lib/rouge/lexer.rb', line 94
def demo(arg=:absent)
  return @demo = arg unless arg == :absent

  @demo = File.read(demo_file, encoding: 'utf-8')
end
.demo_file(arg = :absent) ⇒ Object
Specify or get the path name containing a small demo for this lexer (can be overridden by demo).
# File 'lib/rouge/lexer.rb', line 87
def demo_file(arg=:absent)
  return @demo_file = Pathname.new(arg) unless arg == :absent

  @demo_file = Pathname.new(__FILE__).dirname.join('demos', tag)
end
.desc(arg = :absent) ⇒ Object
Specify or get this lexer’s description.
# File 'lib/rouge/lexer.rb', line 77
def desc(arg=:absent)
  if arg == :absent
    @desc
  else
    @desc = arg
  end
end
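The `:absent` default used here (and in demo and demo_file) lets one method serve as both getter and setter while still accepting `nil` as a real value. A small sketch of the idiom on a hypothetical class, `DescSketch`, which is not part of Rouge:

```ruby
# Hypothetical class demonstrating the combined getter/setter idiom.
class DescSketch
  def self.desc(arg = :absent)
    # No argument: act as a getter. Any argument, even nil: act as a setter.
    arg == :absent ? @desc : (@desc = arg)
  end
end

DescSketch.desc 'A toy lexer' # stores the description
DescSketch.desc               # reads it back
DescSketch.desc nil           # explicitly clears it, which a nil default could not express
```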
.filenames(*fnames) ⇒ Object
Specify a list of filename globs associated with this lexer.
# File 'lib/rouge/lexer.rb', line 215
def filenames(*fnames)
  (@filenames ||= []).concat(fnames)
end
.find(name) ⇒ Object
Given a string, return the correct lexer class.
# File 'lib/rouge/lexer.rb', line 30
def find(name)
  registry[name.to_s]
end
.find_fancy(str, code = nil) ⇒ Object
Find a lexer, with fancy shiny features.
- The string you pass can include CGI-style options:
  Lexer.find_fancy('erb?parent=tex')
- You can pass the special name ‘guess’ so we guess for you, and you can pass a second argument of the code to guess by:
  Lexer.find_fancy('guess', "#!/bin/bash\necho Hello, world")
This is used in the Redcarpet plugin as well as Rouge’s own markdown lexer for highlighting internal code blocks.
# File 'lib/rouge/lexer.rb', line 48
def find_fancy(str, code=nil)
  name, opts = str ? str.split('?', 2) : [nil, '']

  # parse the options hash from a cgi-style string
  opts = CGI.parse(opts || '').map do |k, vals|
    [ k.to_sym, vals.empty? ? true : vals[0] ]
  end

  opts = Hash[opts]

  lexer_class = case name
  when 'guess', nil
    self.guess(:source => code, :mimetype => opts[:mimetype])
  when String
    self.find(name)
  end

  lexer_class && lexer_class.new(opts)
end
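To make the CGI-style syntax concrete, here is a sketch of how a spec string could be split into a lexer name and an options hash. `split_lexer_spec` is a hypothetical helper. It parses the query by hand and decodes with `URI.decode_www_form_component` rather than calling `CGI.parse` as the listing above does, so treat it as an approximation of that behaviour.

```ruby
require 'uri'

# Split a spec such as "erb?parent=tex&debug" into [name, options].
# Hypothetical helper; Rouge itself uses CGI.parse, as shown above.
def split_lexer_spec(spec)
  name, query = spec.split('?', 2)
  opts = {}
  (query || '').split('&').each do |pair|
    key, value = pair.split('=', 2)
    next if key.nil? || key.empty?

    sym = URI.decode_www_form_component(key).to_sym
    next if opts.key?(sym) # keep the first value for a repeated key

    # A bare key with no "=" is treated as a boolean flag.
    opts[sym] = value.nil? ? true : URI.decode_www_form_component(value)
  end
  [name, opts]
end

name, opts = split_lexer_spec('erb?parent=tex&debug')
# name is "erb"; opts maps :parent to "tex" and :debug to true
```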
.guess(info = {}) ⇒ Object
Guess which lexer to use based on a hash of info.
# File 'lib/rouge/lexer.rb', line 147
def guess(info={})
  lexers = guesses(info)

  return Lexers::PlainText if lexers.empty?
  return lexers[0] if lexers.size == 1

  raise AmbiguousGuess.new(lexers)
end
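The three outcomes of the method above (fallback, single match, ambiguity error) can be sketched without the gem. `AmbiguousSketch`, `pick` and `pick_leniently` are hypothetical names standing in for AmbiguousGuess, guess and a caller that prefers a best-effort answer; candidates are plain symbols rather than lexer classes.

```ruby
# Hypothetical error carrying the competing candidates.
class AmbiguousSketch < StandardError
  attr_reader :alternatives

  def initialize(alternatives)
    @alternatives = alternatives
    super("Ambiguous guess: #{alternatives.join(', ')}")
  end
end

# Mirror of the decision logic: fallback, single match, or error.
def pick(candidates, fallback: :plaintext)
  return fallback if candidates.empty?
  return candidates.first if candidates.size == 1

  raise AmbiguousSketch.new(candidates)
end

# A caller that wants some answer can rescue and take the first candidate.
def pick_leniently(candidates)
  pick(candidates)
rescue AmbiguousSketch => e
  e.alternatives.first
end
```

Callers that must never see an exception can use the list-returning variant (guesses) and apply their own tie-breaking rule.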
.guess_by_filename(fname) ⇒ Object
# File 'lib/rouge/lexer.rb', line 160
def guess_by_filename(fname)
  guess :filename => fname
end
.guess_by_mimetype(mt) ⇒ Object
# File 'lib/rouge/lexer.rb', line 156
def guess_by_mimetype(mt)
  guess :mimetype => mt
end
.guess_by_source(source) ⇒ Object
# File 'lib/rouge/lexer.rb', line 164
def guess_by_source(source)
  guess :source => source
end
.guesses(info = {}) ⇒ Object
Guess which lexer to use based on a hash of info.
This accepts the same arguments as Lexer.guess, but will never throw an error. It will return a (possibly empty) list of potential lexers to use.
# File 'lib/rouge/lexer.rb', line 110
def guesses(info={})
  mimetype, filename, source = info.values_at(:mimetype, :filename, :source)
  custom_globs = info[:custom_globs]

  guessers = (info[:guessers] || []).dup

  guessers << Guessers::Mimetype.new(mimetype) if mimetype
  guessers << Guessers::GlobMapping.by_pairs(custom_globs, filename) if custom_globs && filename
  guessers << Guessers::Filename.new(filename) if filename
  guessers << Guessers::Modeline.new(source) if source
  guessers << Guessers::Source.new(source) if source

  Guesser.guess(guessers, Lexer.all)
end
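The listing builds an ordered chain of guessers and hands it to Guesser.guess, whose body is not shown on this page. The sketch below is one plausible way such a chain could narrow a candidate list. The candidate table, `by_filename`, `by_source` and `run_guessers` are all assumptions made for illustration, and the narrowing rules are guesses rather than Rouge's documented behaviour.

```ruby
# Hypothetical candidates: name => filename globs.
CANDIDATES = {
  ruby:   ['*.rb'],
  python: ['*.py'],
  c:      ['*.c', '*.h'],
  cpp:    ['*.cpp', '*.h']
}.freeze

# Each guesser is a callable that narrows a list of candidate names.
def by_filename(fname)
  ->(names) { names.select { |n| CANDIDATES[n].any? { |glob| File.fnmatch?(glob, fname) } } }
end

def by_source(source)
  # Toy heuristic: an iostream include points at C++.
  ->(names) { source.include?('#include <iostream>') ? names & [:cpp] : names }
end

def run_guessers(guessers, names)
  narrowed = guessers.reduce(names) do |remaining, guesser|
    result = guesser.call(remaining)
    # A guesser with no opinion (empty result) leaves the list untouched.
    result.empty? ? remaining : result
  end
  # If nothing was narrowed at all, report no guesses rather than everything.
  narrowed.size < names.size ? narrowed : []
end

run_guessers([by_filename('x.h'), by_source('#include <iostream>')], CANDIDATES.keys)
```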
.lex(stream, opts = {}, &b) ⇒ Object
Lexes `stream` with the given options. The lex is delegated to a new instance.
# File 'lib/rouge/lexer.rb', line 19
def lex(stream, opts={}, &b)
  new(opts).lex(stream, &b)
end
.mimetypes(*mts) ⇒ Object
Specify a list of mimetypes associated with this lexer.
# File 'lib/rouge/lexer.rb', line 225
def mimetypes(*mts)
  (@mimetypes ||= []).concat(mts)
end
.tag(t = nil) ⇒ Object
Used to specify or get the canonical name of this lexer class.
# File 'lib/rouge/lexer.rb', line 187
def tag(t=nil)
  return @tag if t.nil?

  @tag = t.to_s
  Lexer.register(@tag, self)
end
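Taken together, tag, aliases and find behave like a small name registry: declaring a tag or alias registers the class, and find looks it up by string. The following is a miniature of that pattern. `MiniLexer` and `MiniShell` are hypothetical, and the registry is a plain constant here, which is an assumption about storage rather than a description of Rouge's internals.

```ruby
# Hypothetical miniature of a name registry; not Rouge's implementation.
class MiniLexer
  REGISTRY = {}

  def self.register(name, klass)
    REGISTRY[name.to_s] = klass
  end

  def self.find(name)
    REGISTRY[name.to_s]
  end

  # With no argument, return the canonical name; otherwise set and register it.
  def self.tag(t = nil)
    return @tag if t.nil?

    @tag = t.to_s
    MiniLexer.register(@tag, self)
  end

  # Register additional names for the same class.
  def self.aliases(*names)
    names.map!(&:to_s)
    names.each { |n| MiniLexer.register(n, self) }
    (@aliases ||= []).concat(names)
  end
end

class MiniShell < MiniLexer
  tag 'shell'
  aliases 'bash', 'sh'
end

MiniLexer.find('bash') # resolves to MiniShell through the alias
```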
.title(t = nil) ⇒ Object
Specify or get this lexer’s title. Meant to be human-readable.
# File 'lib/rouge/lexer.rb', line 69
def title(t=nil)
  if t.nil?
    t = tag.capitalize
  end
  @title ||= t
end
Instance Method Details
#debug ⇒ Object
Deprecated. Instead of `debug { "foo" }`, simply `puts "foo" if @debug`.
Leave a debug message if the `:debug` option is set. The message is given as a block because some debug messages contain calculated information that is unnecessary for lexing in the real world.
Calls to this method should be guarded with `if @debug` for best performance when debugging is turned off.
# File 'lib/rouge/lexer.rb', line 289
def debug
  warn "Lexer#debug is deprecated. Simply puts if @debug instead."
  puts yield if @debug
end
#lex(string, opts = {}, &b) ⇒ Object
Given a string, yield [token, chunk] pairs. If no block is given, an enumerator is returned.
# File 'lib/rouge/lexer.rb', line 306
def lex(string, opts={}, &b)
  return enum_for(:lex, string, opts) unless block_given?

  Lexer.assert_utf8!(string)

  reset! unless opts[:continue]

  # consolidate consecutive tokens of the same type
  last_token = nil
  last_val = nil
  stream_tokens(string) do |tok, val|
    next if val.empty?

    if tok == last_token
      last_val << val
      next
    end

    b.call(last_token, last_val) if last_token
    last_token = tok
    last_val = val
  end

  b.call(last_token, last_val) if last_token
end
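The consolidation step in the listing (skip empty chunks, merge neighbours that share a token) can be tried in isolation. In this sketch `consolidate` is a hypothetical helper that returns an array instead of yielding, and tokens are plain symbols rather than Rouge token objects.

```ruby
# Merge adjacent pairs that share a token and drop empty chunks.
def consolidate(pairs)
  merged = []
  pairs.each do |tok, val|
    next if val.empty?

    if merged.any? && merged.last[0] == tok
      merged.last[1] << val
    else
      merged << [tok, val.dup] # copy so the caller's strings are not mutated
    end
  end
  merged
end

raw = [[:kw, 'de'], [:kw, 'f'], [:text, ''], [:text, ' '], [:name, 'foo']]
merged = consolidate(raw)
# merged is [[:kw, "def"], [:text, " "], [:name, "foo"]]
```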
#option(k, v = :absent) ⇒ Object
Get or specify one option for this lexer.
# File 'lib/rouge/lexer.rb', line 269
def option(k, v=:absent)
  if v == :absent
    options[k]
  else
    options({ k => v })
  end
end
#options(o = {}) ⇒ Object
Get and/or specify the options for this lexer.
# File 'lib/rouge/lexer.rb', line 262
def options(o={})
  (@options ||= {}).merge!(o)
  self.class.default_options.merge(@options)
end
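Options are layered: values set on an instance are overlaid on the class-level defaults each time they are read. A self-contained sketch of that layering follows; `OptionSketch` is a hypothetical class that borrows the method shapes from this page, and the option names are made up for the example.

```ruby
# Hypothetical class reproducing the option layering; not Rouge's Lexer.
class OptionSketch
  def self.default_options(o = {})
    @default_options ||= {}
    @default_options.merge!(o)
    @default_options
  end

  def initialize(opts = {})
    options(opts)
  end

  # Merge new values into this instance, then overlay them on the class defaults.
  def options(o = {})
    (@options ||= {}).merge!(o)
    self.class.default_options.merge(@options)
  end

  # One argument reads an option; two arguments set it.
  def option(k, v = :absent)
    v == :absent ? options[k] : options({ k => v })
  end
end

OptionSketch.default_options(debug: false, parent: 'html')
lexer = OptionSketch.new(parent: 'tex')
lexer.option(:parent) # the instance value wins over the class default
lexer.option(:debug)  # falls back to the class default
```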
#reset! ⇒ Object
Called at the start of each lex, unless the `:continue` option is given (see #lex). The default implementation is a no-op.
# File 'lib/rouge/lexer.rb', line 298
def reset!
end
#stream_tokens(stream, &b) ⇒ Object
Yield `[token, chunk]` pairs, given a prepared input stream. This must be implemented by subclasses.
# File 'lib/rouge/lexer.rb', line 344
def stream_tokens(stream, &b)
  raise 'abstract'
end
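For a sense of what an implementation of this hook might look like, here is a toy subclass built on the standard library's StringScanner. `AbstractSketch` and `WordSketch` are hypothetical classes, the input is assumed to be a plain String, and the tokens are symbols rather than Rouge token objects.

```ruby
require 'strscan'

# Hypothetical base class reduced to the abstract hook; not Rouge's Lexer.
class AbstractSketch
  def stream_tokens(stream, &b)
    raise 'abstract'
  end
end

# Toy lexer that splits input into words, whitespace and everything else.
class WordSketch < AbstractSketch
  def stream_tokens(stream)
    scanner = StringScanner.new(stream)
    until scanner.eos?
      if (chunk = scanner.scan(/\s+/))
        yield :space, chunk
      elsif (chunk = scanner.scan(/\w+/))
        yield :word, chunk
      else
        yield :other, scanner.getch # consume one character so the loop always advances
      end
    end
  end
end

tokens = []
WordSketch.new.stream_tokens('hi, you') { |tok, chunk| tokens << [tok, chunk] }
# tokens is [[:word, "hi"], [:other, ","], [:space, " "], [:word, "you"]]
```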
#tag ⇒ Object
Delegated to Lexer.tag.
# File 'lib/rouge/lexer.rb', line 333
def tag
  self.class.tag
end