Module: TSort

Defined in:
lib/rubysl/tsort/tsort.rb

Overview

TSort implements topological sorting using Tarjan’s algorithm for strongly connected components.

TSort is designed to be able to be used with any object which can be interpreted as a directed graph.

TSort requires two methods to interpret an object as a graph, tsort_each_node and tsort_each_child.

  • tsort_each_node is used to iterate for all nodes over a graph.

  • tsort_each_child is used to iterate for child nodes of a given node.

The equality of nodes are defined by eql? and hash since TSort uses Hash internally.

A Simple Example

The following example demonstrates how to mix the TSort module into an existing class (in this case, Hash). Here, we’re treating each key in the hash as a node in the graph, and so we simply alias the required #tsort_each_node method to Hash’s #each_key method. For each key in the hash, the associated value is an array of the node’s child nodes. This choice in turn leads to our implementation of the required #tsort_each_child method, which fetches the array of child nodes and then iterates over that array using the user-supplied block.

require 'tsort'

class Hash
  include TSort
  alias tsort_each_node each_key
  def tsort_each_child(node, &block)
    fetch(node).each(&block)
  end
end

{1=>[2, 3], 2=>[3], 3=>[], 4=>[]}.tsort
#=> [3, 2, 1, 4]

{1=>[2], 2=>[3, 4], 3=>[2], 4=>[]}.strongly_connected_components
#=> [[4], [2, 3], [1]]

A More Realistic Example

A very simple ‘make’ like tool can be implemented as follows:

require 'tsort'

class Make
  def initialize
    @dep = {}
    @dep.default = []
  end

  def rule(outputs, inputs=[], &block)
    triple = [outputs, inputs, block]
    outputs.each {|f| @dep[f] = [triple]}
    @dep[triple] = inputs
  end

  def build(target)
    each_strongly_connected_component_from(target) {|ns|
      if ns.length != 1
        fs = ns.delete_if {|n| Array === n}
        raise TSort::Cyclic.new("cyclic dependencies: #{fs.join ', '}")
      end
      n = ns.first
      if Array === n
        outputs, inputs, block = n
        inputs_time = inputs.map {|f| File.mtime f}.max
        begin
          outputs_time = outputs.map {|f| File.mtime f}.min
        rescue Errno::ENOENT
          outputs_time = nil
        end
        if outputs_time == nil ||
           inputs_time != nil && outputs_time <= inputs_time
          sleep 1 if inputs_time != nil && inputs_time.to_i == Time.now.to_i
          block.call
        end
      end
    }
  end

  def tsort_each_child(node, &block)
    @dep[node].each(&block)
  end
  include TSort
end

def command(arg)
  print arg, "\n"
  system arg
end

m = Make.new
m.rule(%w[t1]) { command 'date > t1' }
m.rule(%w[t2]) { command 'date > t2' }
m.rule(%w[t3]) { command 'date > t3' }
m.rule(%w[t4], %w[t1 t3]) { command 'cat t1 t3 > t4' }
m.rule(%w[t5], %w[t4 t2]) { command 'cat t4 t2 > t5' }
m.build('t5')

Bugs

  • ‘tsort.rb’ is wrong name because this library uses Tarjan’s algorithm for strongly connected components. Although ‘strongly_connected_components.rb’ is correct but too long.

References

    1. Tarjan, “Depth First Search and Linear Graph Algorithms”,

SIAM Journal on Computing, Vol. 1, No. 2, pp. 146-160, June 1972.

Defined Under Namespace

Classes: Cyclic

Instance Method Summary collapse

Instance Method Details

#each_strongly_connected_componentObject

The iterator version of the #strongly_connected_components method. obj.each_strongly_connected_component is similar to obj.strongly_connected_components.each, but modification of obj during the iteration may lead to unexpected results.

#each_strongly_connected_component returns nil.



177
178
179
180
181
182
183
184
185
186
187
188
# File 'lib/rubysl/tsort/tsort.rb', line 177

def each_strongly_connected_component # :yields: nodes
  id_map = {}
  stack = []
  tsort_each_node {|node|
    unless id_map.include? node
      each_strongly_connected_component_from(node, id_map, stack) {|c|
        yield c
      }
    end
  }
  nil
end

#each_strongly_connected_component_from(node, id_map = {}, stack = []) ⇒ Object

Iterates over strongly connected component in the subgraph reachable from node.

Return value is unspecified.

#each_strongly_connected_component_from doesn’t call #tsort_each_node.



198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
# File 'lib/rubysl/tsort/tsort.rb', line 198

def each_strongly_connected_component_from(node, id_map={}, stack=[]) # :yields: nodes
  minimum_id = node_id = id_map[node] = id_map.size
  stack_length = stack.length
  stack << node

  tsort_each_child(node) {|child|
    if id_map.include? child
      child_id = id_map[child]
      minimum_id = child_id if child_id && child_id < minimum_id
    else
      sub_minimum_id =
        each_strongly_connected_component_from(child, id_map, stack) {|c|
          yield c
        }
      minimum_id = sub_minimum_id if sub_minimum_id < minimum_id
    end
  }

  if node_id == minimum_id
    component = stack.slice!(stack_length .. -1)
    component.each {|n| id_map[n] = nil}
    yield component
  end

  minimum_id
end

#strongly_connected_componentsObject

Returns strongly connected components as an array of arrays of nodes. The array is sorted from children to parents. Each elements of the array represents a strongly connected component.



162
163
164
165
166
# File 'lib/rubysl/tsort/tsort.rb', line 162

def strongly_connected_components
  result = []
  each_strongly_connected_component {|component| result << component}
  result
end

#tsortObject

Returns a topologically sorted array of nodes. The array is sorted from children to parents, i.e. the first element has no child and the last node has no parent.

If there is a cycle, TSort::Cyclic is raised.



133
134
135
136
137
# File 'lib/rubysl/tsort/tsort.rb', line 133

def tsort
  result = []
  tsort_each {|element| result << element}
  result
end

#tsort_eachObject

The iterator version of the #tsort method. obj.tsort_each is similar to obj.tsort.each, but modification of obj during the iteration may lead to unexpected results.

#tsort_each returns nil. If there is a cycle, TSort::Cyclic is raised.



147
148
149
150
151
152
153
154
155
# File 'lib/rubysl/tsort/tsort.rb', line 147

def tsort_each # :yields: node
  each_strongly_connected_component {|component|
    if component.size == 1
      yield component.first
    else
      raise Cyclic.new("topological sort failed: #{component.inspect}")
    end
  }
end

#tsort_each_child(node) ⇒ Object

Should be implemented by a extended class.

#tsort_each_child is used to iterate for child nodes of node.

Raises:

  • (NotImplementedError)


239
240
241
# File 'lib/rubysl/tsort/tsort.rb', line 239

def tsort_each_child(node) # :yields: child
  raise NotImplementedError.new
end

#tsort_each_nodeObject

Should be implemented by a extended class.

#tsort_each_node is used to iterate for all nodes over a graph.

Raises:

  • (NotImplementedError)


230
231
232
# File 'lib/rubysl/tsort/tsort.rb', line 230

def tsort_each_node # :yields: node
  raise NotImplementedError.new
end