Class: Webgen::PathHandler

Inherits:
Object
Includes:
ExtensionManager
Defined in:
lib/webgen/path_handler.rb,
lib/webgen/path_handler/api.rb,
lib/webgen/path_handler/base.rb,
lib/webgen/path_handler/copy.rb,
lib/webgen/path_handler/feed.rb,
lib/webgen/path_handler/page.rb,
lib/webgen/path_handler/sitemap.rb,
lib/webgen/path_handler/virtual.rb,
lib/webgen/path_handler/template.rb,
lib/webgen/path_handler/directory.rb,
lib/webgen/path_handler/meta_info.rb,
lib/webgen/path_handler/page_utils.rb

Overview

Namespace for all path handlers.

About

A path handler is a webgen extension that uses source Path objects to create Node objects and that provides methods for rendering these nodes. The nodes are stored in a hierarchy, the root of which is a Tree object. Path handlers can do simple things, like copying a path from the source to the destination, or complex things, like generating a whole set of nodes from one input path (e.g. generating a whole image gallery).

The paths that are handled by a path handler are generally specified via path patterns. The #create_nodes method of a path handler is called for each source path that should be handled. And when it is time to write out a node, the #content method on the path handler associated with the node is called to retrieve the rendered content of the node.

Tree creation

The method #populate_tree is used for creating the initial node tree, the internal representation of all paths. It is only the initial tree because it is possible that additional, secondary nodes are created during the rendering phase by using the #create_secondary_nodes method.

Tree creation works like this:

  1. All path handlers on the invocation list are used in turn. The order is important; it allows avoiding unnecessary write phases and it makes sure that, for example, directory nodes are created before their file nodes.

  2. When a path handler is used for creating nodes, all source paths (retrieved by using the Webgen::Source#paths method) that match one of the associated patterns and/or all paths with the ‘handler’ meta information set to the path handler are used.

  3. The meta information of a used source path is then updated with the meta information applied by methods registered for the :apply_meta_info_to_path blackboard message.

    After that the source path is given to the #parse_meta_info! method of the path handler so that meta information of the path can be updated with meta information stored in the content of the path itself.

    Then the meta information ‘versions’ is used to determine if multiple versions of the path should be used for creating nodes, and each path version is then given to the #create_nodes method of the path handler so that it can create one or more nodes.

  4. Nodes returned by #create_nodes of a path handler are assumed to have the Node#node_info keys :path and :path_handler and the meta info key ‘modified_at’ correctly set (this is automatically done if the Webgen::PathHandler::Base#create_node method is used).
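The selection part of the steps above (invocation order, pattern matching, and the ‘handler’ meta information) can be sketched as follows. This is a simplified, self-contained model and not webgen's implementation: SketchPath, SketchHandlerEntry and sketch_assign_paths are hypothetical names, and matching with File.fnmatch and File::FNM_PATHNAME is an assumption about how glob-style patterns could be checked.

```ruby
# Hypothetical stand-ins; none of these names belong to webgen.
SketchPath = Struct.new(:path, :meta_info)
SketchHandlerEntry = Struct.new(:name, :patterns)

# Walk the invocation list front to back and collect, per handler, the
# source paths it would be asked to create nodes for.
def sketch_assign_paths(invocation_order, source_paths)
  invocation_order.each_with_object({}) do |handler, result|
    matching = source_paths.select do |path|
      # A path is used if it matches a pattern of the handler or if its
      # 'handler' meta information names the handler explicitly.
      handler.patterns.any? {|pat| File.fnmatch(pat, path.path, File::FNM_PATHNAME) } ||
        path.meta_info['handler'] == handler.name
    end
    result[handler.name] = matching.map(&:path)
  end
end

handlers = [
  SketchHandlerEntry.new('copy', ['**/*.jpg']),
  SketchHandlerEntry.new('page', ['**/*.page'])
]
paths = [
  SketchPath.new('/images/a.jpg', {}),
  SketchPath.new('/docs/index.page', {}),
  SketchPath.new('/docs/raw.txt', {'handler' => 'copy'})
]
p sketch_assign_paths(handlers, paths)
```

In this model the path /docs/raw.txt matches no pattern but is still assigned to the copy handler because of its ‘handler’ meta information.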

Path Patterns and Invocation order

Path patterns define which paths are handled by a specific path handler. These patterns are specified when a path handler is registered using the #register method. The patterns need to have a format that Dir.glob can handle. Note that a user can always associate any path with a path handler through a meta information path and the ‘handler’ meta information key.

In addition to specifying the patterns a path handler uses, one can also specify the place in the invocation list which the path handler should use. The invocation list is used from the front to the back when the Tree is created.

Implementing a path handler

A path handler must take the website as the only parameter on initialization and needs to define the following methods:

#parse_meta_info!(path)

Update path.meta_info with meta information found in the content of the path. The return values of this method are given to the #create_nodes method as additional parameters!

This allows one to use a single pass for reading the meta information and the normal content of the path.

#create_nodes(path, …)

Create one or more nodes from the path and return them. If #parse_meta_info! returns one or more values, these values are provided as additional parameters to this method.

It is a good idea to use the helper method Webgen::PathHandler::Base#create_node for actually creating a node.

#content(node)

Return the content of the given node. This method is only called for nodes that have been created by the path handler.
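How the return values of #parse_meta_info! reach #create_nodes can be illustrated with a small stand-alone model. Only the three method names come from the description above; SketchHandler, sketch_create, the hash-based path and the "header, blank line, body" content layout are assumptions made for this sketch.

```ruby
# Hypothetical handler; only the method names come from the text above.
class SketchHandler
  # Assumed content layout: "key: value" header lines, a blank line, the body.
  def parse_meta_info!(path)
    header, body = path[:content].split("\n\n", 2)
    header.each_line do |line|
      key, value = line.chomp.split(': ', 2)
      path[:meta_info][key] = value
    end
    body # handed to #create_nodes as an additional parameter
  end

  # The body was already read in the single pass above, so it is not parsed again.
  def create_nodes(path, body = nil)
    [{meta_info: path[:meta_info], body: body}]
  end

  def content(node)
    node[:body]
  end
end

# Stand-in for the framework: forward whatever #parse_meta_info! returned.
def sketch_create(handler, path)
  extra = handler.parse_meta_info!(path)
  handler.create_nodes(path, *extra)
end

path = {content: "title: Home\n\nHello", meta_info: {}}
p sketch_create(SketchHandler.new, path)
```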

Also note that a path handler does not need to reside under the Webgen::PathHandler namespace but all built-in ones do so that auto-loading of the path handlers works.

The Webgen::PathHandler::Base module provides default implementations of the needed methods (except for #create_nodes) and should be used by all path handlers! If a path handler processes paths in Webgen Page Format, it should probably also use Webgen::PathHandler::PageUtils.

Information that is used by a path handler only for processing purposes should be stored in the #node_info hash of a node as the #meta_info hash is reserved for user provided node meta information.

Following is a simple path handler class example which copies paths from the source to the destination and modifies the extension in the process:

class SimpleCopy

  include Webgen::PathHandler::Base

  def create_nodes(path)
    path.ext += '.copied'
    create_node(path)
  end

  def content(node)
    node.node_info[:path]
  end

end

website.ext.path_handler.register(SimpleCopy, patterns: ['**/*.jpg', '**/*.png'])

Defined Under Namespace

Modules: Base, PageUtils Classes: Api, Copy, Directory, Feed, MetaInfo, Page, Sitemap, Template, Virtual

Methods included from ExtensionManager

#initialize_copy, #registered?, #registered_extensions

Constructor Details

#initialize(website) ⇒ PathHandler

Create a new path handler object for the given website.



# File 'lib/webgen/path_handler.rb', line 134

def initialize(website)
  super()
  @website = website
  @current_dest_node = nil
  @invocation_order = []
  @instances = {}
  @secondary_nodes = {}

  @website.blackboard.add_listener(:website_generated, 'path_handler') do
    @website.cache[:path_handler_secondary_nodes] = @secondary_nodes
  end

  used_secondary_paths = {}
  written_nodes = Set.new
  @website.blackboard.add_listener(:before_secondary_nodes_created, 'path_handler') do |path, source_alcn|
    (used_secondary_paths[source_alcn] ||= Set.new) << path if source_alcn
  end
  @website.blackboard.add_listener(:before_all_nodes_written, 'path_handler') do |node|
    used_secondary_paths = {}
    written_nodes = Set.new
  end
  @website.blackboard.add_listener(:after_node_written, 'path_handler') do |node|
    written_nodes << node.alcn
  end
  @website.blackboard.add_listener(:after_all_nodes_written, 'path_handler') do
    @secondary_nodes.delete_if do |path, data|
      if written_nodes.include?(data[1]) && (!used_secondary_paths[data[1]] ||
                                             !used_secondary_paths[data[1]].include?(path))
        data[2].each {|alcn| @website.tree.delete_node(@website.tree[alcn])}
        true
      end
    end
  end
end
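The condition used in the :after_all_nodes_written listener can be read in isolation as "a stored secondary path is stale if its source node was written in this pass but did not request the path again". The following sketch restates that rule with plain data. sketch_stale_secondary_paths is a hypothetical helper, and the [content, source_alcn, created_alcns] layout is taken from the source shown above.

```ruby
require 'set'

# secondary_nodes maps path => [content, source_alcn, created_alcns].
# written is the set of alcns written in the pass; used maps a source alcn
# to the set of secondary paths it requested during the pass.
def sketch_stale_secondary_paths(secondary_nodes, written, used)
  stale = secondary_nodes.select do |path, (_content, source_alcn, _alcns)|
    written.include?(source_alcn) &&
      !(used[source_alcn] && used[source_alcn].include?(path))
  end
  stale.keys
end

secondary = {
  '/a.css' => ['x', '/index.html', ['/a.css']],
  '/b.css' => ['y', '/index.html', ['/b.css']],
  '/c.css' => ['z', '/other.html', ['/c.css']]
}
p sketch_stale_secondary_paths(secondary, Set['/index.html'],
                               {'/index.html' => Set['/a.css']})
```

Entries whose source node was not written in the pass are left alone, since nothing is known about whether they are still needed.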

Instance Attribute Details

#current_dest_node ⇒ Object (readonly)

The destination node if one is currently written (only during the invocation of #write_tree) or nil otherwise.



# File 'lib/webgen/path_handler.rb', line 131

def current_dest_node
  @current_dest_node
end

Instance Method Details

#create_secondary_nodes(path, content = '', source_alcn = nil) ⇒ Object

Create nodes for the given path (a Path object which must not be a source path).

The content of the path also needs to be specified. Note that if an IO block is associated with the path, it is discarded!

If the meta information ‘handler’ is set on the path, nodes from the given path are only created with the specified handler.

If the secondary nodes are created during the rendering phase (and not during node creation, i.e. in a #create_nodes method of a path handler), the source_alcn has to be set to the alcn of the node from which these nodes are created!



# File 'lib/webgen/path_handler.rb', line 342

def create_secondary_nodes(path, content = '', source_alcn = nil)
  if (sn = @secondary_nodes[path]) && sn[1] != source_alcn
    raise Webgen::NodeCreationError.new("Duplicate secondary path name <#{path}>", 'path_handler', path)
  end
  @website.blackboard.dispatch_msg(:before_secondary_nodes_created, path, source_alcn)

  path['modified_at'] ||= @website.tree[source_alcn]['modified_at'] if source_alcn
  path.set_io { StringIO.new(content) }

  nodes = if path['handler']
            @website.blackboard.dispatch_msg(:apply_meta_info_to_path, path)
            create_nodes_with_path_handler(path, path['handler'])
          else
            create_nodes([path])
          end
  @website.blackboard.dispatch_msg(:after_secondary_nodes_created, path, nodes)

  if source_alcn
    path.set_io(&nil)
    _, _, stored_alcns = @secondary_nodes.delete(path)
    cur_alcns = nodes.map {|n| n.alcn}
    (stored_alcns - cur_alcns).each {|n| @website.tree.delete_node(@website.tree[n])} if stored_alcns
    @secondary_nodes[path.dup] = [content, source_alcn, cur_alcns]
  end

  nodes
end
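The bookkeeping in this method (the duplicate check and the removal of nodes that are no longer produced) can be modelled without a website or a tree. In the sketch below, sketch_record_secondary is a hypothetical helper, a plain hash stands in for the stored secondary nodes, and ArgumentError stands in for Webgen::NodeCreationError. It assumes source_alcn is always given, which is the only case in which the method above records anything.

```ruby
# registry maps path => [content, source_alcn, created_alcns].
# Returns the alcns that were created earlier for the path but are not
# created any more, i.e. the nodes that would be deleted from the tree.
def sketch_record_secondary(registry, path, content, source_alcn, new_alcns)
  if (entry = registry[path]) && entry[1] != source_alcn
    raise ArgumentError, "Duplicate secondary path name <#{path}>"
  end
  _, _, old_alcns = registry.delete(path)
  stale = (old_alcns || []) - new_alcns
  registry[path] = [content, source_alcn, new_alcns]
  stale
end

registry = {}
sketch_record_secondary(registry, '/gallery/', 'v1', '/gallery.html',
                        ['/gallery/1.html', '/gallery/2.html'])
p sketch_record_secondary(registry, '/gallery/', 'v2', '/gallery.html',
                          ['/gallery/1.html'])
```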

#instance(handler) ⇒ Object

Return the instance of the path handler class with the given name.



# File 'lib/webgen/path_handler.rb', line 213

def instance(handler)
  @instances[handler.intern] ||= extension(handler).new(@website)
end
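The effect of the `||=` in this method is that each path handler class is instantiated lazily and at most once per name, regardless of whether the name is passed as a string or a symbol. A minimal stand-alone sketch of that behaviour, with SketchRegistry as a hypothetical class and a constructor without the website argument for brevity:

```ruby
# Hypothetical registry; the real method looks up the class via #extension
# and passes the website to the constructor.
class SketchRegistry
  def initialize(classes)
    @classes = classes # name (Symbol) => class
    @instances = {}
  end

  # Create the instance on first use and reuse it afterwards.
  def instance(name)
    @instances[name.intern] ||= @classes.fetch(name.intern).new
  end
end

registry = SketchRegistry.new(copy: Class.new)
p registry.instance('copy').equal?(registry.instance(:copy))
```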

#populate_tree ⇒ Object

Populate the website tree with nodes.

Can only be called once because the tree can only be populated once!



# File 'lib/webgen/path_handler.rb', line 227

def populate_tree
  raise Webgen::NodeCreationError.new("Can't populate tree twice", 'path_handler') if @website.tree.root

  time = Benchmark.measure do
    meta_info, rest = @website.ext.source.paths.partition {|path| path.path =~ /[\/.]metainfo$/}

    used_paths = []

    @website.blackboard.add_listener(:before_node_created, 'path_handler (temp_populate_tree)') do |path|
      used_paths << path
    end
    create_nodes(meta_info, [:meta_info])
    create_nodes(rest)
    @website.blackboard.remove_listener(:before_node_created, 'path_handler (temp_populate_tree)')

    unused_paths = rest - used_paths
    @website.logger.vinfo do
      "The following source paths have not been used: #{unused_paths.join(', ')}"
    end if unused_paths.length > 0

    (@website.cache[:path_handler_secondary_nodes] || {}).each do |path, (content, source_alcn, _)|
      next if !@website.tree[source_alcn]
      create_secondary_nodes(path, content, source_alcn)
    end
  end
  @website.logger.vinfo do
    "Populating node tree took " << ('%2.2f' % time.real) << ' seconds'
  end

  @website.blackboard.dispatch_msg(:after_tree_populated)
end
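The first step of this method separates meta information paths from all other source paths so that they can be handled first. The split can be tried out with plain strings. The regular expression is the one from the source above, while SKETCH_META_INFO_RE and sketch_split_paths are hypothetical names, and real source paths are Path objects rather than strings.

```ruby
# Same pattern as in the source above: a path ending in "/metainfo" or ".metainfo".
SKETCH_META_INFO_RE = /[\/.]metainfo$/

# Returns [meta_info_paths, remaining_paths].
def sketch_split_paths(paths)
  paths.partition {|path| path =~ SKETCH_META_INFO_RE }
end

meta_info, rest = sketch_split_paths(
  ['/metainfo', '/docs/index.page', '/docs/special.metainfo', '/metainfo.txt']
)
p meta_info
p rest
```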

#register(klass, options = {}, &block) ⇒ Object

Register a path handler.

The parameter klass has to contain the name of the path handler class or the class object itself. If the class is located under this namespace, only the class name without the hierarchy part is needed, otherwise the full class name including parent module/class names is needed.

Options:

:name

The name for the path handler. If not set, it defaults to the snake-case version of the class name (without the hierarchy part). It should only contain letters.

:patterns

A list of path patterns for which the path handler should be used. If not specified, defaults to an empty list.

:insert_at

Specifies the position in the invocation list. If not specified or if :end is specified, the handler is added to the end of the list. If :front is specified, it is added to the beginning of the list. Otherwise the value is expected to be a position number and the path handler is added at the specified position in the list.

Examples:

path_handler.register('Template')     # registers Webgen::PathHandler::Template

path_handler.register('::Template')   # registers Template !!!

path_handler.register('MyModule::Doit', name: 'template', patterns: ['**/*.template'])
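The effect of :insert_at on the invocation list can be tried out with a plain array. The helper below is a hypothetical stand-in that only models the list handling, relying on Array#insert, where position -1 appends to the end. Registering a name a second time moves it to the new position.

```ruby
# Hypothetical helper; models only the invocation list, not the registration.
def sketch_insert(order, name, insert_at = nil)
  pos = case insert_at
        when nil, :end then -1
        when :front then 0
        else insert_at.to_i
        end
  order.delete(name) # a re-registered handler is moved, not duplicated
  order.insert(pos, name)
end

order = []
sketch_insert(order, 'copy')
sketch_insert(order, 'page', :end)
sketch_insert(order, 'directory', :front)
sketch_insert(order, 'feed', 1)
p order
```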


# File 'lib/webgen/path_handler.rb', line 198

def register(klass, options={}, &block)
  name = do_register(klass, options, false, &block)
  ext_data(name).patterns = options[:patterns] || []
  pos = if options[:insert_at].nil? || options[:insert_at] == :end
          -1
        elsif options[:insert_at] == :front
          0
        else
          options[:insert_at].to_i
        end
  @invocation_order.delete(name)
  @invocation_order.insert(pos, name)
end

#write_tree ⇒ Object

Write all changed nodes of the website tree to their respective destination using the Destination object at website.ext.destination.

Returns the number of passes needed for correctly writing out all paths.



# File 'lib/webgen/path_handler.rb', line 263

def write_tree
  passes = 0
  content = nil

  begin
    at_least_one_node_written = false
    @website.cache.reset_volatile_cache
    @website.blackboard.dispatch_msg(:before_all_nodes_written)
    @website.tree.node_access[:alcn].sort_by {|a, n| [n['write_order'].to_s, a]}.each do |name, node|
      begin
        next if node == @website.tree.dummy_root ||
          (node['passive'] && !node['no_output'] && !@website.ext.item_tracker.node_referenced?(node)) ||
          ((@website.config['website.dry_run'] || node['no_output'] || @website.ext.destination.exists?(node.dest_path)) &&
           !@website.ext.item_tracker.node_changed?(node))

        @website.blackboard.dispatch_msg(:before_node_written, node)
        if !node['no_output']
          content = write_node(node)
          at_least_one_node_written = true
        end
        @website.blackboard.dispatch_msg(:after_node_written, node, content)
      rescue Webgen::Error => e
        e.path = node.alcn if e.path.to_s.empty?
        e.location = "path_handler.#{name_of_instance(node.node_info[:path_handler])}" unless e.location
        raise
      rescue Exception => e
        raise Webgen::RenderError.new(e, "path_handler.#{name_of_instance(node.node_info[:path_handler])}", node)
      end
    end
    @website.blackboard.dispatch_msg(:after_all_nodes_written)
    passes += 1 if at_least_one_node_written
  end while at_least_one_node_written

  @website.blackboard.dispatch_msg(:website_generated)
  passes
end
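The pass counting in this method can be modelled in a few lines: nodes are visited sorted by ‘write_order’ and alcn, and passes are repeated until one pass writes nothing. In the sketch below, sketch_write_passes and the hash-based nodes are hypothetical, and the block stands in for the whole "has this node changed, and was it written" decision.

```ruby
# Repeat write passes until a pass writes nothing; count only the passes
# in which at least one node was written.
def sketch_write_passes(nodes)
  passes = 0
  begin
    wrote = false
    nodes.sort_by {|n| [n[:write_order].to_s, n[:alcn]] }.each do |node|
      next unless yield(node) # block returns true if the node was written
      wrote = true
    end
    passes += 1 if wrote
  end while wrote
  passes
end

a = {alcn: '/a.html', dirty: false}
b = {alcn: '/b.html', dirty: true}
passes = sketch_write_passes([a, b]) do |node|
  next false unless node[:dirty]
  node[:dirty] = false
  a[:dirty] = true if node.equal?(b) # writing b changes a, forcing another pass
  true
end
p passes
```

Because /a.html sorts before /b.html, the change caused by writing /b.html can only be picked up in a second pass, so this example needs two passes.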