Class: CodeRay::Tokens

Inherits:
Array
  • Object
Defined in:
lib/coderay/tokens.rb

Overview

Tokens TODO: Rewrite!

The Tokens class represents a list of tokens returned from a Scanner.

A token is not a special object, just a two-element Array consisting of

  • the token text (the original source of the token in a String) or a token action (begin_group, end_group, begin_line, end_line)

  • the token kind (a Symbol representing the type of the token)

A token looks like this:

['# It looks like this', :comment]
['3.1415926', :float]
['$^', :error]

Some scanners also yield sub-tokens, represented by special token actions, namely begin_group and end_group.

The Ruby scanner, for example, splits “a string” into:

[
 [:begin_group, :string],
 ['"', :delimiter],
 ['a string', :content],
 ['"', :delimiter],
 [:end_group, :string]
]
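As a rough illustration of how such a stream can be consumed, the sketch below walks a list of token pairs, rebuilds the source text, and tracks how deeply groups are nested. The `summarize` helper is hypothetical and not part of CodeRay; it assumes tokens are modelled as the two-element pairs shown above.

```ruby
# Hypothetical helper (not a CodeRay method): rebuilds the source text from
# [content, kind] pairs and records the deepest group/line nesting.
def summarize(tokens)
  text = +''
  depth = 0
  max_depth = 0
  tokens.each do |content, _kind|
    case content
    when :begin_group, :begin_line
      depth += 1
      max_depth = depth if depth > max_depth
    when :end_group, :end_line
      depth -= 1
    when String
      text << content   # only text tokens contribute to the source
    end
  end
  { text: text, max_depth: max_depth, balanced: depth.zero? }
end

tokens = [
  [:begin_group, :string],
  ['"', :delimiter],
  ['a string', :content],
  ['"', :delimiter],
  [:end_group, :string]
]

result = summarize(tokens)
puts result[:text]       # the original source, quotes included
puts result[:balanced]   # true when every begin_* has a matching end_*
```

Token actions carry no text of their own, so concatenating only the String contents is enough to recover the input.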

Tokens is the interface between Scanners and Encoders: The input is split and saved into a Tokens object. The Encoder then builds the output from this object.

Thus, the syntax below becomes clear:

CodeRay.scan('price = 2.59', :ruby).html
# the Tokens object is here -------^

See how small it is? ;)

Tokens gives you the power to handle pre-scanned code very easily: You can convert it to a webpage, a YAML file, or dump it into a gzip’ed string that you put in your DB.

It also allows you to generate tokens directly (without using a scanner), to load them from a file, and still use any Encoder that CodeRay provides.

Defined Under Namespace

Modules: Undumping

Dynamic Method Handling

This class handles dynamic methods through the method_missing method

#method_missing(meth, options = {}) ⇒ Object

Redirects unknown methods to encoder calls.

For example, if you call tokens.html, the HTML encoder is used to highlight the tokens.



# File 'lib/coderay/tokens.rb', line 80

def method_missing meth, options = {}
  encode meth, options
rescue PluginHost::PluginNotFound
  super
end
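The delegation pattern used here can be sketched in isolation. Everything in the block below is a hypothetical stand-in (`TinyTokens`, `EncoderNotFound`, and the `ENCODERS` registry are not CodeRay names); it only illustrates how an unknown method name is tried as an encoder name and falls back to the normal NoMethodError otherwise.

```ruby
# Hypothetical error class standing in for PluginHost::PluginNotFound.
class EncoderNotFound < StandardError; end

# Hypothetical miniature of the pattern; tokens are [text, kind] pairs.
class TinyTokens < Array
  # Assumed registry mapping encoder names to callables.
  ENCODERS = {
    text:  ->(tokens, _options) { tokens.map(&:first).join },
    kinds: ->(tokens, _options) { tokens.map(&:last) }
  }.freeze

  def encode(name, options = {})
    encoder = ENCODERS.fetch(name) { raise EncoderNotFound, "no encoder: #{name}" }
    encoder.call(self, options)
  end

  # Unknown methods are tried as encoder names first.
  def method_missing(meth, options = {})
    encode meth, options
  rescue EncoderNotFound
    super   # not an encoder either: raise the usual NoMethodError
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(meth, include_private = false)
    ENCODERS.key?(meth) || super
  end
end

tokens = TinyTokens.new([['price', :ident], [' ', :space], ['=', :operator]])
puts tokens.text          # dispatched through method_missing
puts tokens.kinds.inspect
```

Defining `respond_to_missing?` alongside `method_missing` is a general Ruby convention; the method shown above does not do so, and the sketch adds it only for completeness.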

Instance Attribute Details

#scanner ⇒ Object

The Scanner instance that created the tokens.



# File 'lib/coderay/tokens.rb', line 56

def scanner
  @scanner
end

Class Method Details

.load(dump) ⇒ Object

Unzip the dump using GZip.gunzip, then restore the object using Marshal.load.

The result is commonly a Tokens object, but this is not guaranteed.



# File 'lib/coderay/tokens.rb', line 201

def Tokens.load dump
  dump = GZip.gunzip dump
  @dump = Marshal.load dump
end

Instance Method Details

#begin_group(kind) ⇒ Object



# File 'lib/coderay/tokens.rb', line 207

def begin_group kind; push :begin_group, kind end

#begin_line(kind) ⇒ Object



# File 'lib/coderay/tokens.rb', line 209

def begin_line kind; push :begin_line, kind end

#count ⇒ Object

Return the actual number of tokens, which is half the size of the underlying Array.



# File 'lib/coderay/tokens.rb', line 181

def count
  size / 2
end
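The division by two suggests that a token's text (or action) and its kind are stored as consecutive elements of one flat Array rather than as nested pairs. That is an inference from the code shown on this page, not a documented guarantee. Under that assumption, a minimal sketch with hypothetical helper names looks like this:

```ruby
# Assumed flat layout: text (or action) and kind alternate in a single Array.
flat = ['price', :ident, ' ', :space, '=', :operator]

# Hypothetical helper mirroring the size / 2 computation.
def token_count(flat_tokens)
  flat_tokens.size / 2
end

# Hypothetical helper: regroup the flat list into [text, kind] pairs.
def token_pairs(flat_tokens)
  flat_tokens.each_slice(2).to_a
end

puts token_count(flat)          # => 3
puts token_pairs(flat).inspect
```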

#dump(gzip_level = 7) ⇒ Object

Dumps the object into a String that can be saved in files or databases.

The dump is created with Marshal.dump; in addition, it is gzipped using GZip.gzip.

The returned String object includes Undumping so it has an #undump method. See Tokens.load.

You can configure the level of compression, but the default value 7 should be what you want in most cases as it is a good compromise between speed and compression rate.

See GZip module.



# File 'lib/coderay/tokens.rb', line 174

def dump gzip_level = 7
  dump = Marshal.dump self
  dump = GZip.gzip dump, gzip_level
  dump.extend Undumping
end
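The dump and load steps can be approximated with the standard library alone. The sketch below assumes the GZip helpers behave like zlib deflate and inflate; `pack`, `unpack`, and the `Restorable` module are hypothetical names that mimic the roles of #dump, Tokens.load, and Undumping.

```ruby
require 'zlib'

# Hypothetical counterpart of Undumping: lets a packed String restore itself.
module Restorable
  def restore
    Marshal.load(Zlib::Inflate.inflate(self))
  end
end

# Serialize first, then compress (the order used by #dump).
def pack(object, level = 7)
  Zlib::Deflate.deflate(Marshal.dump(object), level).extend(Restorable)
end

# Decompress first, then deserialize (the order used by Tokens.load).
def unpack(dump)
  Marshal.load(Zlib::Inflate.inflate(dump))
end

packed = pack([['3.1415926', :float], ['$^', :error]])
puts packed.restore == unpack(packed)   # both paths yield the same object
```

Marshal.load can instantiate arbitrary objects, so dumps should only be loaded from trusted storage.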

#encode(encoder, options = {}) ⇒ Object

Encode the tokens using encoder.

encoder can be

  • a symbol like :html or :statistic

  • an Encoder class

  • an Encoder object

options are passed to the encoder.



# File 'lib/coderay/tokens.rb', line 66

def encode encoder, options = {}
  encoder = Encoders[encoder].new options if encoder.respond_to? :to_sym
  encoder.encode_tokens self, options
end
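The name-or-object dispatch can be illustrated with a self-contained sketch. `UpcaseEncoder`, `REGISTRY`, and `encode_with` are hypothetical and stand in for a real encoder, the Encoders lookup, and #encode respectively; tokens are modelled as [text, kind] pairs.

```ruby
# Hypothetical encoder: joins the token texts and upcases them.
class UpcaseEncoder
  def initialize(options = {})
    @options = options
  end

  def encode_tokens(tokens, _options = {})
    tokens.map(&:first).join.upcase
  end
end

# Assumed name-to-class registry, standing in for the Encoders lookup.
REGISTRY = { upcase: UpcaseEncoder }.freeze

def encode_with(tokens, encoder, options = {})
  # Anything symbol-like is resolved to a fresh encoder instance;
  # everything else is assumed to be an encoder object already.
  encoder = REGISTRY.fetch(encoder.to_sym).new(options) if encoder.respond_to?(:to_sym)
  encoder.encode_tokens(tokens, options)
end

tokens = [['price', :ident], [' = ', :operator], ['2.59', :float]]
puts encode_with(tokens, :upcase)             # resolved by name
puts encode_with(tokens, UpcaseEncoder.new)   # used as given
```

Checking `respond_to?(:to_sym)` rather than the class means Strings are accepted as encoder names as well as Symbols.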

#end_group(kind) ⇒ Object



# File 'lib/coderay/tokens.rb', line 208

def end_group kind; push :end_group, kind end

#end_line(kind) ⇒ Object



# File 'lib/coderay/tokens.rb', line 210

def end_line kind; push :end_line, kind end

#split_into_parts(*sizes) ⇒ Object

Split the tokens into parts of the given sizes.

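The result will be an Array of Tokens objects. Each part holds as much token text as the corresponding size specifies; a token that crosses a boundary is cut in two. In addition, each part closes all groups and lines that are still open, and the following part opens them again. This is useful for inserting tokens between the parts.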

This method is used by Scanner#tokenize when called with an Array of source strings. The Diff encoder uses it for inline highlighting.



# File 'lib/coderay/tokens.rb', line 95

def split_into_parts *sizes
  parts = []
  opened = []
  content = nil
  part = Tokens.new
  part_size = 0
  size = sizes.first
  i = 0
  for item in self
    case content
    when nil
      content = item
    when String
      if size && part_size + content.size > size  # token must be cut
        if part_size < size  # some part of the token goes into this part
          content = content.dup  # content may not be safe to change
          part << content.slice!(0, size - part_size) << item
        end
        # close all open groups and lines...
        closing = opened.reverse.flatten.map do |content_or_kind|
          case content_or_kind
          when :begin_group
            :end_group
          when :begin_line
            :end_line
          else
            content_or_kind
          end
        end
        part.concat closing
        begin
          parts << part
          part = Tokens.new
          size = sizes[i += 1]
        end until size.nil? || size > 0
        # ...and open them again.
        part.concat opened.flatten
        part_size = 0
        redo unless content.empty?
      else
        part << content << item
        part_size += content.size
      end
      content = nil
    when Symbol
      case content
      when :begin_group, :begin_line
        opened << [content, item]
      when :end_group, :end_line
        opened.pop
      else
        raise ArgumentError, 'Unknown token action: %p, kind = %p' % [content, item]
      end
      part << content << item
      content = nil
    else
      raise ArgumentError, 'Token input junk: %p, kind = %p' % [content, item]
    end
  end
  parts << part
  parts << Tokens.new while parts.size < sizes.size
  parts
end
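The core cutting logic is easier to follow without the group bookkeeping. The sketch below is a deliberately simplified version, not the method above: `split_text_tokens` is a hypothetical helper that handles text tokens only, models them as [text, kind] pairs, and does not close or reopen groups and lines.

```ruby
# Simplified sketch: splits text tokens into parts of the given text sizes.
# Text left over after the last size goes into one final part.
def split_text_tokens(pairs, *sizes)
  parts = [[]]
  index = 0
  room = sizes.first            # text still allowed in the current part
  pairs.each do |text, kind|
    text = text.dup             # never modify the caller's strings
    until text.empty?
      if room.nil?              # no sizes left: keep the remainder whole
        parts.last << [text, kind]
        break
      end
      if room.zero?             # current part is full: start the next one
        parts << []
        index += 1
        room = sizes[index]
        next
      end
      piece = text.slice!(0, room)   # cut the token at the boundary
      parts.last << [piece, kind]
      room -= piece.size
    end
  end
  parts << [] while parts.size < sizes.size   # pad with empty parts
  parts
end

pairs = [['price', :ident], [' = ', :operator], ['2.59', :float]]
split_text_tokens(pairs, 3, 4).each { |part| puts part.inspect }
```

With sizes 3 and 4, the first token is cut after "pri" and the operator token after " =", so both halves of a cut token keep the original kind.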

#to_s ⇒ Object

Turn tokens into a string by concatenating them.



72
73
74
# File 'lib/coderay/tokens.rb', line 72

def to_s
  encode CodeRay::Encoders::Encoder.new
end