Class: Rocco

Inherits:
Object
  • Object
Defined in:
lib/rocco.rb,
lib/rocco/tasks.rb

Overview

Reopen the Rocco class and add a `make` class method. This is a simple bit of sugar over `Rocco::Task.new`. If you want your Rake task to be named something other than `:rocco`, you can use `Rocco::Task` directly.

Defined Under Namespace

Classes: Layout, Task

Constant Summary

VERSION =
'0.6'
C_STYLE_COMMENTS =

Given a file’s language, we should be able to autopopulate the `comment_chars` variables for single-line comments. If we don’t have comment characters on record for a given language, we’ll use the user-provided `:comment_char` option (which defaults to `#`).

Comment characters are listed as:

{ :single  => "//",
  :multi   => { :start => "/**", :middle => "*", :end => "*/" },
  :heredoc => nil }

`:single` denotes the leading character of a single-line comment. `:multi` holds the block comment strings: `:start` denotes the string that should appear alone on a line of code to begin a block of documentation, `:middle` denotes the leading character of block comment content, and `:end` is the string that ought to appear alone on a line to close a block of documentation. `:heredoc` is the marker that opens a heredoc (`"<<-"` for Ruby), or `nil` for languages without one. That is:

/**                 [:multi][:start]
 *                  [:multi][:middle]
 ...
 *                  [:multi][:middle]
 */                 [:multi][:end]

If a language only has one type of comment, the missing type should be assigned `nil`.

At the moment, we’re only returning `:single`. Consider this groundwork for block comment parsing.

{
  :single => "//",
  :multi  => { :start => "/**", :middle => "*", :end => "*/" },
  :heredoc => nil
}
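
As a rough illustration of how the `:multi` entries can be put to work, the sketch below builds one regular expression per delimiter and labels each line of a small block comment. The helpers `delimiter_patterns` and `classify` and the sample lines are hypothetical and not part of Rocco; only the hash layout is taken from the constant above.

```ruby
# Hypothetical sketch: label the lines of a block comment using a style
# hash laid out like C_STYLE_COMMENTS.

# Build one anchored regex per delimiter. Regexp.escape keeps "*" from
# being read as a regex quantifier.
def delimiter_patterns(style)
  multi = style[:multi]
  return {} if multi.nil?
  {
    :start  => /^\s*#{Regexp.escape(multi[:start])}\s*$/,
    :end    => /^\s*#{Regexp.escape(multi[:end])}\s*$/,
    :middle => multi[:middle] && /^\s*#{Regexp.escape(multi[:middle])}\s?/
  }
end

# Return :start, :end, :middle, or :other for a single line. The end
# pattern is tried before the middle one because " */" begins with "*".
def classify(line, patterns)
  return :start  if patterns[:start]  && line =~ patterns[:start]
  return :end    if patterns[:end]    && line =~ patterns[:end]
  return :middle if patterns[:middle] && line =~ patterns[:middle]
  :other
end

style = {
  :single  => "//",
  :multi   => { :start => "/**", :middle => "*", :end => "*/" },
  :heredoc => nil
}
patterns = delimiter_patterns(style)
labels = ["/**", " * Adds two numbers.", " */", "int add(int a, int b);"].map do |line|
  classify(line, patterns)
end
# labels is [:start, :middle, :end, :other]
```

A style whose `:multi` entry is `nil` simply produces an empty pattern set, so every line is classified as `:other`.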
COMMENT_STYLES =
{
  "bash"          =>  { :single => "#", :multi => nil },
  "c"             =>  C_STYLE_COMMENTS,
  "coffee-script" =>  {
    :single => "#",
    :multi  => { :start => "###", :middle => nil, :end => "###" },
    :heredoc => nil
  },
  "cpp" =>  C_STYLE_COMMENTS,
  "csharp" => C_STYLE_COMMENTS,
  "css"           =>  {
    :single => nil,
    :multi  => { :start => "/**", :middle => "*", :end => "*/" },
    :heredoc => nil
  },
  "html"           =>  {
    :single => nil,
    :multi => { :start => '<!--', :middle => nil, :end => '-->' },
    :heredoc => nil
  },
  "java"          =>  C_STYLE_COMMENTS,
  "js"            =>  C_STYLE_COMMENTS,
  "lua"           =>  {
    :single => "--",
    :multi => nil,
    :heredoc => nil
  },
  "php" => C_STYLE_COMMENTS,
  "python"        =>  {
    :single => "#",
    :multi  => { :start => '"""', :middle => nil, :end => '"""' },
    :heredoc => nil
  },
  "rb"            =>  {
    :single => "#",
    :multi  => { :start => '=begin', :middle => nil, :end => '=end' },
    :heredoc => "<<-"
  },
  "scheme"        =>  { :single => ";;",  :multi => nil, :heredoc => nil },
  "xml"           =>  {
    :single => nil,
    :multi => { :start => '<!--', :middle => nil, :end => '-->' },
    :heredoc => nil
  },
}

Constructor Details

#initialize(filename, sources = [], options = {}, &block) ⇒ Rocco

Returns a new instance of Rocco.



# File 'lib/rocco.rb', line 78

def initialize(filename, sources=[], options={}, &block)
  @file       = filename
  @sources    = sources

  # When `block` is given, it must read the contents of the file using
  # whatever means necessary and return it as a string. With no `block`,
  # the file is read to retrieve data.
  @data =
    if block_given?
      yield
    else
      File.read(filename)
    end

  defaults = {
    :language      => 'ruby',
    :comment_chars => '#',
    :template_file => nil
  }
  @options = defaults.merge(options)

  # If we detect a language
  if detect_language() != "text"
    # then assign the detected language to `:language`, and look for
    # comment characters based on that language
    @options[:language] = detect_language()
    @options[:comment_chars] = generate_comment_chars()

  # If we didn't detect a language, but the user provided one, use it
  # to look around for comment characters to override the default.
  elsif @options[:language] != defaults[:language]
    @options[:comment_chars] = generate_comment_chars()

  # If neither is true, then convert the default comment character string
  # into the comment_char syntax (we'll discuss that syntax in detail when
  # we get to `generate_comment_chars()` in a moment).
  else
    @options[:comment_chars] = {
      :single => @options[:comment_chars],
      :multi => nil
    }
  end

  # Turn `:comment_chars` into a regex matching a series of spaces, the
  # `:comment_chars` string, and then an optional space.  We'll use that
  # to detect single-line comments.
  @comment_pattern =
    Regexp.new("^\\s*#{@options[:comment_chars][:single]}\s?")

  # `parse()` the file contents stored in `@data`.  Run the result through
  # `split()` and that result through `highlight()` to generate the final
  # section list.
  @sections = highlight(split(parse(@data)))
end
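
The three-way choice of comment characters in the constructor can be read in isolation. The following is a minimal sketch of that decision, assuming a cut-down table named `KNOWN_STYLES` in place of `COMMENT_STYLES` and a made-up helper `resolve_comment_chars` that receives the detected language as an argument instead of calling `detect_language`.

```ruby
# Hypothetical sketch of the three-way choice made in the constructor.
# KNOWN_STYLES is a cut-down stand-in for COMMENT_STYLES.
KNOWN_STYLES = {
  "rb"  => { :single => "#",  :multi => nil },
  "lua" => { :single => "--", :multi => nil }
}
DEFAULT_LANGUAGE = "ruby"

def resolve_comment_chars(detected, options)
  fallback = { :single => options[:comment_chars], :multi => nil }
  if detected != "text"
    # A detected language wins over whatever the caller passed in.
    KNOWN_STYLES.fetch(detected, fallback)
  elsif options[:language] != DEFAULT_LANGUAGE
    # Nothing detected, but the caller named a language explicitly.
    KNOWN_STYLES.fetch(options[:language], fallback)
  else
    # Neither: wrap the plain comment string in the hash form.
    fallback
  end
end

resolve_comment_chars("text", { :language => "lua", :comment_chars => "#" })
# picks the "lua" entry, whose :single value is "--"
```

In every branch the result has the same hash shape, which is why the rest of the class can always read `@options[:comment_chars][:single]`.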

Instance Attribute Details

#file ⇒ Object (readonly)

The filename as given to `Rocco.new`.



# File 'lib/rocco.rb', line 134

def file
  @file
end

#options ⇒ Object (readonly)

The merged options hash.



# File 'lib/rocco.rb', line 137

def options
  @options
end

#sections ⇒ Object (readonly)

A list of two-tuples representing each section of the source file. Each item in the list has the form `[docs_html, code_html]`, where both elements are strings containing the documentation and source code HTML, respectively.



# File 'lib/rocco.rb', line 143

def sections
  @sections
end

#sources ⇒ Object (readonly)

A list of all source filenames included in the documentation set. Useful for building an index of other files.



# File 'lib/rocco.rb', line 147

def sources
  @sources
end

Class Method Details

.make(dest = 'docs/', source_files = 'lib/**/*.rb', options = {}) ⇒ Object



# File 'lib/rocco/tasks.rb', line 54

def self.make(dest='docs/', source_files='lib/**/*.rb', options={})
  Task.new(:rocco, dest, source_files, options)
end

Instance Method Details

#detect_language ⇒ Object

If `pygmentize` is available, we can use it to autodetect a file’s language based on its filename. Filenames without extensions, or with extensions that `pygmentize` doesn’t understand, will return `text`. We’ll also return `text` if `pygmentize` isn’t available.

We’ll memoize the result, as we’ll call this a few times.



# File 'lib/rocco.rb', line 170

def detect_language
  @_language ||=
    if pygmentize?
      %x[pygmentize -N #{@file}].strip.split('+').first
    else
      "text"
    end
end
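
The string handling in `detect_language` can be tried without `pygmentize` installed. The sketch below uses a hypothetical helper, `primary_lexer`, and the sample strings passed to it are assumptions about what the command might print, not captured output.

```ruby
# Hypothetical helper: reduce a lexer name such as "html+php" to its
# first component, mirroring `.strip.split('+').first` above.
def primary_lexer(raw_output)
  name = raw_output.strip.split('+').first
  # An empty string has no first component; treat it as plain text.
  name.nil? ? "text" : name
end

primary_lexer("html+php\n")  # keeps only the part before the "+"
primary_lexer("rb\n")        # a simple name passes through unchanged
```

Falling back to `"text"` for empty output is a choice made for this sketch; the method above would return `nil` in that case.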

#docblock(docs) ⇒ Object

Take a list of block comments and convert Docblock @annotations to Markdown syntax.



# File 'lib/rocco.rb', line 418

def docblock(docs)
  docs.map do |doc|
    doc.split("\n").map do |line|
      line.match(/^@\w+/) ? line.sub(/^@(\w+)\s+/, '> **\1** ')+"  " : line
    end.join("\n")
  end
end
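
To see what the rewrite does to a single comment, here is a standalone sketch of the per-line step. The helper name `annotate` and the sample comment are made up for illustration; the regular expressions are the ones used in `docblock`.

```ruby
# Standalone copy of the per-line rewrite, for experimentation.
def annotate(line)
  if line.match(/^@\w+/)
    # "@param x" becomes a blockquote with the tag in bold; the two
    # trailing spaces force a Markdown line break.
    line.sub(/^@(\w+)\s+/, '> **\1** ') + "  "
  else
    line
  end
end

doc = "Adds two numbers.\n@param a first addend\n@return the sum"
converted = doc.split("\n").map { |line| annotate(line) }.join("\n")
# converted now reads:
#   Adds two numbers.
#   > **param** a first addend  
#   > **return** the sum  
```

Lines that do not start with `@` pass through untouched, so ordinary Markdown in the comment is preserved.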

#generate_comment_chars ⇒ Object



# File 'lib/rocco.rb', line 261

def generate_comment_chars
  @_commentchar ||=
    if COMMENT_STYLES[@options[:language]]
      COMMENT_STYLES[@options[:language]]
    else
      { :single => @options[:comment_chars], :multi => nil, :heredoc => nil }
    end
end

#highlight(blocks) ⇒ Object

Take the result of `split` and apply Markdown formatting to comments and syntax highlighting to source code.



# File 'lib/rocco.rb', line 428

def highlight(blocks)
  docs_blocks, code_blocks = blocks

  # Pre-process Docblock @annotations.
  if @options[:docblocks]
    docs_blocks = docblock(docs_blocks)
  end

  # Combine all docs blocks into a single big markdown document with section
  # dividers and run through the Markdown processor. Then split it back out
  # into separate sections.
  markdown = docs_blocks.join("\n\n##### DIVIDER\n\n")
  docs_html = Markdown.new(markdown, :smart).
    to_html.
    split(/\n*<h5>DIVIDER<\/h5>\n*/m)

  # Combine all code blocks into a single big stream with section dividers and
  # run through either `pygmentize(1)` or <http://pygments.appspot.com>
  span, espan = '<span class="c.?">', '</span>'
  if @options[:comment_chars][:single]
    front = @options[:comment_chars][:single]
    divider_input  = "\n\n#{front} DIVIDER\n\n"
    divider_output = Regexp.new(
      [ "\\n*",
        span,
        Regexp.escape(CGI.escapeHTML(front)),
        ' DIVIDER',
        espan,
        "\\n*"
      ].join, Regexp::MULTILINE
    )
  else
    front = @options[:comment_chars][:multi][:start]
    back  = @options[:comment_chars][:multi][:end]
    divider_input  = "\n\n#{front}\nDIVIDER\n#{back}\n\n"
    divider_output = Regexp.new(
      [ "\\n*",
        span, Regexp.escape(CGI.escapeHTML(front)), espan,
        "\\n",
        span, "DIVIDER", espan,
        "\\n",
        span, Regexp.escape(CGI.escapeHTML(back)), espan,
        "\\n*"
      ].join, Regexp::MULTILINE
    )
  end

  code_stream = code_blocks.join(divider_input)

  code_html =
    if pygmentize?
      highlight_pygmentize(code_stream)
    else
      highlight_webservice(code_stream)
    end

  # Do some post-processing on the pygments output to split things back
  # into sections and remove partial `<pre>` blocks.
  code_html = code_html.
    split(divider_output).
    map { |code| code.sub(/\n?<div class="highlight"><pre>/m, '') }.
    map { |code| code.sub(/\n?<\/pre><\/div>\n/m, '') }

  # Lastly, combine the docs and code lists back into a list of two-tuples.
  docs_html.zip(code_html)
end
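
The divider trick used for the documentation blocks is easy to try on its own: join the fragments with a marker heading, render once, and split the output on the rendered marker. In the sketch below `tiny_render` is a deliberately crude, hypothetical stand-in for the Markdown processor, and `render_all` is a made-up name; only the divider string and the split pattern come from `highlight`.

```ruby
# Hypothetical stand-in for a Markdown processor: it only understands
# level-5 headings and plain paragraphs, which is enough for the demo.
def tiny_render(markdown)
  markdown.split(/\n{2,}/).map do |para|
    if para =~ /\A##### (.*)\z/
      "<h5>#{$1}</h5>"
    else
      "<p>#{para}</p>"
    end
  end.join("\n\n") + "\n"
end

# Render many fragments in one pass, then cut the output apart again.
def render_all(fragments)
  combined = fragments.join("\n\n##### DIVIDER\n\n")
  tiny_render(combined).split(/\n*<h5>DIVIDER<\/h5>\n*/m)
end

render_all(["first", "second"])
# yields one HTML fragment per input fragment
```

Rendering everything in one pass means the processor is invoked once per file rather than once per section, which matters most when the highlighter is an external process or a remote service.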

#highlight_pygmentize(code) ⇒ Object

We `popen` a read/write pygmentize process in the parent and then fork off a child process to write the input.



# File 'lib/rocco.rb', line 497

def highlight_pygmentize(code)
  code_html = nil
  open("|pygmentize -l #{@options[:language]} -O encoding=utf-8 -f html", 'r+') do |fd|
    pid =
      fork {
        fd.close_read
        fd.write code
        fd.close_write
        exit!
      }
    fd.close_write
    code_html = fd.read
    fd.close_read
    Process.wait(pid)
  end

  code_html
end
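
The reason for a separate writer is that a filter process may start producing output before it has consumed all of its input; if the parent wrote everything first and only then read, both sides could block on full pipe buffers. The sketch below shows the same shape with two substitutions that are not in the original: a thread takes the place of the forked writer, and a child Ruby process that upcases its input takes the place of `pygmentize`. The name `filter_through_child` is hypothetical.

```ruby
require 'rbconfig'

# Hypothetical filter: pipe `input` through a child Ruby process that
# upcases its stdin. A thread stands in for the forked writer here.
def filter_through_child(input)
  command = [RbConfig.ruby, "-e", "print STDIN.read.upcase"]
  IO.popen(command, "r+") do |io|
    writer = Thread.new do
      io.write(input)
      io.close_write   # signals EOF so the child can finish
    end
    output = io.read   # drain while the writer is still feeding
    writer.join
    output
  end
end

filter_through_child("hello")
```

A thread avoids `fork`, which is unavailable on some platforms, at the cost of differing from the method shown above.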

#highlight_webservice(code) ⇒ Object

Pygments is not one of those things that’s trivial for a Ruby user to install, so we’ll fall back on a web service to highlight the code if it isn’t available.



# File 'lib/rocco.rb', line 518

def highlight_webservice(code)
  Net::HTTP.post_form(
    URI.parse('http://pygments.appspot.com/'),
    {'lang' => @options[:language], 'code' => code}
  ).body
end

#normalize_leading_spaces(sections) ⇒ Object

Normalizes documentation whitespace by checking for leading whitespace, removing it, and then removing the same amount of whitespace from each succeeding line. That is:

def func():
  """
    Comment 1
    Comment 2
  """
  print "omg!"

should yield a comment block of `Comment 1\nComment 2` and code of `def func():\n  print "omg!"`



# File 'lib/rocco.rb', line 388

def normalize_leading_spaces(sections)
  sections.map do |section|
    if section.any? && section[0].any?
      leading_space = section[0][0].match("^\s+")
      if leading_space
        section[0] =
          section[0].map{ |line| line.sub(/^#{leading_space.to_s}/, '') }
      end
    end
    section
  end
end
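
The dedenting step can be exercised on a single `[docs, code]` pair. This is a simplified sketch and not Rocco's exact code: `dedent_docs` is a made-up name, it returns a new pair instead of modifying the section in place, and it escapes the captured indentation before reusing it in a pattern.

```ruby
# Standalone sketch: strip the first doc line's indentation from every
# doc line of one [docs, code] pair. `dedent_docs` is a made-up name.
def dedent_docs(section)
  docs, code = section
  return section if docs.nil? || docs.empty?
  indent = docs.first[/\A\s+/]          # leading whitespace, or nil
  return section if indent.nil?
  [docs.map { |line| line.sub(/\A#{Regexp.escape(indent)}/, '') }, code]
end

section = [["    Comment 1", "    Comment 2"], ["def func():", "  print \"omg!\""]]
dedent_docs(section)
# the docs become ["Comment 1", "Comment 2"]; the code lines are unchanged
```

Only the indentation of the first line is removed, so a more deeply indented line keeps its extra spaces relative to the first.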

#parse(data) ⇒ Object

Parse the raw file data into a list of two-tuples. Each tuple has the form `[docs, code]` where both elements are arrays containing the raw lines parsed from the input file, comment characters stripped.



# File 'lib/rocco.rb', line 276

def parse(data)
  sections = []
  docs, code = [], []
  lines = data.split("\n")

  # The first line is ignored if it is a shebang line.  We also ignore the
  # PEP 263 encoding information in python sourcefiles, and the similar ruby
  # 1.9 syntax.
  lines.shift if lines[0] =~ /^\#\!/
  lines.shift if lines[0] =~ /coding[:=]\s*[-\w.]+/ &&
                 [ "python", "rb" ].include?(@options[:language])

  # To detect both block comments and single-line comments, we'll set
  # up a tiny state machine, and loop through each line of the file.
  # This requires an `in_comment_block` boolean, and a few regular
  # expressions for line tests.  We'll do the same for fake heredoc parsing.
  in_comment_block = false
  in_heredoc = false
  single_line_comment, block_comment_start, block_comment_mid, block_comment_end =
    nil, nil, nil, nil
  if not @options[:comment_chars][:single].nil?
    single_line_comment = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:single])}\\s?")
  end
  if not @options[:comment_chars][:multi].nil?
    block_comment_start = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*$")
    block_comment_end   = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    block_comment_one_liner = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*(.*?)\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    block_comment_start_with = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:start])}\\s*(.*?)$")
    block_comment_end_with = Regexp.new("\\s*(.*?)\\s*#{Regexp.escape(@options[:comment_chars][:multi][:end])}\\s*$")
    if @options[:comment_chars][:multi][:middle]
      block_comment_mid = Regexp.new("^\\s*#{Regexp.escape(@options[:comment_chars][:multi][:middle])}\\s?")
    end
  end
  if not @options[:comment_chars][:heredoc].nil?
    heredoc_start = Regexp.new("#{Regexp.escape(@options[:comment_chars][:heredoc])}(\\S+)$")
  end
  lines.each do |line|
    # If we're currently in a comment block, check whether the line matches
    # the _end_ of a comment block or the _end_ of a comment block with a
    # comment.
    if in_comment_block
      if block_comment_end && line.match(block_comment_end)
        in_comment_block = false
      elsif block_comment_end_with && line.match(block_comment_end_with)
        in_comment_block = false
        docs << line.match(block_comment_end_with).captures.first.
                      sub(block_comment_mid || '', '')
      else
        docs << line.sub(block_comment_mid || '', '')
      end
    # If we're currently in a heredoc, we're looking for the end of the
    # heredoc, and everything it contains is code.
    elsif in_heredoc
      if line.match(Regexp.new("^#{Regexp.escape(in_heredoc)}$"))
        in_heredoc = false
      end
      code << line
    # Otherwise, check whether the line starts a heredoc. If so, note the end
    # pattern, and the line is code.  Otherwise check whether the line matches
    # the beginning of a block, or a single-line comment all on its lonesome.
    # In either case, if there's code, start a new section.
    else
      if heredoc_start && line.match(heredoc_start)
        in_heredoc = $1
        code << line
      elsif block_comment_one_liner && line.match(block_comment_one_liner)
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.match(block_comment_one_liner).captures.first
      elsif block_comment_start && line.match(block_comment_start)
        in_comment_block = true
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
      elsif block_comment_start_with && line.match(block_comment_start_with)
        in_comment_block = true
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.match(block_comment_start_with).captures.first
      elsif single_line_comment && line.match(single_line_comment)
        if code.any?
          sections << [docs, code]
          docs, code = [], []
        end
        docs << line.sub(single_line_comment || '', '')
      else
        code << line
      end
    end
  end
  sections << [docs, code] if docs.any? || code.any?
  normalize_leading_spaces(sections)
end
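
Stripped of block comments and heredocs, the core of `parse` is a short loop: comment lines accumulate as docs, other lines accumulate as code, and a comment that follows code starts a new section. The sketch below shows just that core under those simplifying assumptions; `split_sections` is a hypothetical name.

```ruby
# Simplified sketch: single-line comments only, no block comments or
# heredocs. `split_sections` is a hypothetical name.
def split_sections(source, comment = "#")
  pattern = /^\s*#{Regexp.escape(comment)}\s?/
  sections = []
  docs, code = [], []
  source.split("\n").each do |line|
    if line =~ pattern
      # A comment that follows code closes the current section.
      if code.any?
        sections << [docs, code]
        docs, code = [], []
      end
      docs << line.sub(pattern, '')
    else
      code << line
    end
  end
  sections << [docs, code] if docs.any? || code.any?
  sections
end

source = "# Say hello.\nputs 'hello'\n# Say goodbye.\nputs 'goodbye'"
split_sections(source)
# two sections, each pairing one doc line with one code line
```

Consecutive comment lines stay in the same section because a new section is only opened once some code has been collected.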

#pygmentize? ⇒ Boolean

Returns `true` if `pygmentize` is available locally, `false` otherwise.

Returns:

  • (Boolean)


# File 'lib/rocco.rb', line 159

def pygmentize?
  @_pygmentize ||= ENV['PATH'].split(':').
    any? { |dir| File.executable?("#{dir}/pygmentize") }
end
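
The same search works for any program name. The following sketch generalizes it under two assumptions that are not in the original: the helper is called `on_path?`, and it takes the search path as an optional argument so it can be tested against a known directory. It also uses `File::PATH_SEPARATOR` in place of a literal `:`.

```ruby
# Hypothetical generalization of the check above: look for any program
# name in a PATH-style string. File::PATH_SEPARATOR is ":" on Unix.
def on_path?(program, path = ENV['PATH'].to_s)
  path.split(File::PATH_SEPARATOR).any? do |dir|
    File.executable?(File.join(dir, program))
  end
end

on_path?("pygmentize")  # true only if an executable of that name is found
```

One detail worth knowing about the memoized version above: `||=` evaluates its right-hand side again whenever the stored value is `false` or `nil`, so a negative result is recomputed on each call and only a positive one is cached.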

#split(sections) ⇒ Object

Take the list of paired section two-tuples and split it into two separate lists: one holding the comments with leaders removed and one with the code blocks.



# File 'lib/rocco.rb', line 404

def split(sections)
  docs_blocks, code_blocks = [], []
  sections.each do |docs,code|
    docs_blocks << docs.join("\n")
    code_blocks << code.map do |line|
      tabs = line.match(/^(\t+)/)
      tabs ? line.sub(/^\t+/, '  ' * tabs.captures[0].length) : line
    end.join("\n")
  end
  [docs_blocks, code_blocks]
end
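
The tab handling inside `split` is the least obvious part, so here it is as a standalone sketch. The helper name `expand_leading_tabs` is made up; the behavior follows the `map` block above.

```ruby
# Standalone sketch of the tab handling: each leading tab becomes two
# spaces; tabs later in the line are left alone.
def expand_leading_tabs(line)
  tabs = line[/\A\t+/]
  tabs ? line.sub(/\A\t+/, '  ' * tabs.length) : line
end

expand_leading_tabs("\t\tfoo\tbar")
# the two leading tabs become four spaces; the inner tab is kept
```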

#to_html ⇒ Object



# File 'lib/rocco.rb', line 151

def to_html
  Rocco::Layout.new(self, @options[:template_file]).render
end