Class: Bio::AssemblyGraphAlgorithms::SingleCoherentPathsBetweenNodesFinder::CycleCounter

Inherits:
Object
  • Object
show all
Includes:
FinishM::Logging
Defined in:
lib/assembly/single_coherent_paths_between_nodes.rb

Overview

Count occurrences of cycles in paths through an assembly graph. Works by building a hash of paths and the frequency of the modal cycle in that path (up to the cut-off max_cycles). For an unknown path, looks for a subset of the path in hash by removing nodes from start (or end if :forward option is set), and then extends the subset by iteratively re-adding a single node and adding to the hash of paths the larger of the subset count or the frequency for the modal cycle beginning with the re-added node.

Instance Method Summary collapse

Methods included from FinishM::Logging

#log

Constructor Details

#initialize(max_cycles, options = {}) ⇒ CycleCounter

Returns a new instance of CycleCounter.



378
379
380
381
382
# File 'lib/assembly/single_coherent_paths_between_nodes.rb', line 378

def initialize(max_cycles, options = {})
  @max_cycles = max_cycles
  @path_cache = Hash.new # Cache max_cycles for previously seen paths
  @forward = options[:forward] || false # By default builds hash assuming backtracking from end of path. This flag will reverse path direction and build hash assuming moving forwards.
end

Instance Method Details

#path_cycle_count(path) ⇒ Object

Iterate through unique nodes of path and find maximal cycle counts



386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
# File 'lib/assembly/single_coherent_paths_between_nodes.rb', line 386

def path_cycle_count(path)
  log.debug "Finding cycles in path #{path.collect{|onode| onode.node.node_id}.join(',')}." if log.debug?
  first_part = []
  second_part = path
  keys = []
  count = nil
  reached_max_cycles = false

  second_part = second_part.reverse if @forward

  # Iterate along path and look for the remaining path in cache. Remember the iterated
  # path and the remaining path. Stop if a cache count is found, else use zero.
  while count.nil? and !second_part.empty?
    key = second_part.collect{|onode| onode.to_settable}.flatten

    # Check if path value is cached
    if @path_cache.has_key? key
      count = @path_cache[key]
      #log.debug "Found cached count #{count} for path #{second_part.collect{|onode| onode.node.node_id}.join(',')}." if log.debug?
      break
    else
      first_part = [first_part, second_part.first].flatten
      second_part = second_part[1..-1]
    end
  end

  if second_part.empty?
    #log.debug "Reached end of path without finding cached count." if log.debug?
    count = 0
  end

  # The max cycle count for a path is the largest of:
  #   I. Cycle count for initial node of path in remaining path (without initial
  #      node), or
  #   II. Max cycle count of remaining path.

  # We then iterate back through the iterated path. If count does not exceed
  # max_cycles, we count cycles for each node in the remaining path, then
  # backtrack by moving the node to the remaining path set. We record the count
  # for each remaining path
  while !first_part.empty?

    node = first_part.last
    if !reached_max_cycles
      #log.debug "Next node is #{node.node.node_id}." if log.debug?
      node_count = path_cycle_count_for_node(node, second_part, @max_cycles)
      count = [count, node_count].max
      reached_max_cycles = count > @max_cycles
    end

    second_part = [node, second_part].flatten
    first_part = first_part[0...-1]

    key = second_part.collect{|onode| onode.to_settable}.flatten
    @path_cache[key] = count
    #log.debug "Caching cycle count #{count} for path #{second_part.collect{|onode| onode.node.node_id}.join(',')}." if log.debug?
  end
  if reached_max_cycles and log.debug?
    log.debug "Most repeated cycle in path occured #{count} or more times."
  elsif log.debug?
    log.debug "Most repeated cycle in path occured #{count} times."
  end
  return count
end

#path_cycle_count_for_node(node, path, max_cycles = 1) ⇒ Object

For an initial node, find and count unique ‘simple’ cycles in a path that begin at the initial node, up to a max_cycles. Return count for the maximally repeated cycle if less than max_cycles, or max_cycles.



454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
# File 'lib/assembly/single_coherent_paths_between_nodes.rb', line 454

def path_cycle_count_for_node(node, path, max_cycles=1)
  #log.debug "Finding all simple cycles for node #{node.node_id} in path #{path.collect{|onode| onode.node.node_id}.join(',')}." if log.debug?
  remaining = path
  cycles = Hash.new

  remaining = remaining.reverse if @forward

  while remaining.include?(node)
    position = remaining.index(node)
    cycle = remaining[0..position]
    remaining = remaining[(position+1)..-1]
    #log.debug "Found cycle: #{cycle.collect{|onode| onode.node.node_id}.join(',')}." if log.debug?

    set_key = cycle.collect{|onode| onode.to_settable}.flatten
    cycles[set_key] ||= 0
    cycles[set_key] += 1
    #log.debug "Found repeat #{cycles[set_key]}." if log.debug?

    if cycles[set_key] > max_cycles
      #log.debug "Max cycles #{max_cycles} exceeded." if log.debug?
      return cycles[set_key]
    end
  end
  if cycles.empty?
    max_counts = 0
  else
    max_counts = cycles.values.max
  end
  #log.debug "Most cycles found #{max_counts}." if log.debug?
  return max_counts
end