Class: Bio::AssemblyGraphAlgorithms::BubblyAssembler::Bubble
- Inherits:
-
Object
- Object
- Bio::AssemblyGraphAlgorithms::BubblyAssembler::Bubble
- Includes:
- FinishM::Logging
- Defined in:
- lib/assembly/bubbly_assembler.rb
Overview
Tim - use ‘waiting train’ algorithm (made up by me). Problems collect the nodes they visit, adding them to hashes of ‘ubiquitous’ and ‘visited’ nodes (metaphor: ‘train’ (problem) visiting ‘stations’ (nodes)). Each time a problem is dequeued, new problems are enqueued for all neighbours to the problem node (metaphor: ‘trains’ (problems) magically duplicate for each path to a new ‘station’ (node) (methaphor breaks a bit here)). At each step the algorithm dequeues a problem, prioritising problems by shortest distance of any path to the problem node, meaning if a a problem is enqueued for a node that is already known, then that problem is prioritised (metaphor: when a train leaves a station (problem is deqeued) other ‘trains’ will wait in case it catches up, or otherwise reaches a more distant station). If a new problem is enqueued for a problem node that is currently in enqueued, the new problem is added to known problems removed from queue, and when a problem is dequeued, its ubiquitous and visited nodes are set to the ubiquitous and visited nodes of all known problems for the node (metaphor: the carriages of all trains at a station are merged into one train). Cycles occur when a problem reaches a node that is in its visited nodes hash (metaphor: a station that one of the train carriages has previously visited). Queued cyclic problems are added to known problems and then dropped. Bubble is converged when all current problems have a ubiquitous node in common (metaphor: all carriages of all current trains have visited a station).
Instance Attribute Summary collapse
-
#converging_oriented_node_settable ⇒ Object
readonly
The DynamicProgrammingProblem this bubble converges on.
-
#is_reverse ⇒ Object
readonly
The DynamicProgrammingProblem this bubble converges on.
-
#num_legit_forks ⇒ Object
how many legit forks have been explored.
-
#root ⇒ Object
readonly
The DynamicProgrammingProblem this bubble converges on.
Instance Method Summary collapse
-
#[](index) ⇒ Object
This doesn’t make sense unless this is a converged bubble and the index == -1 because otherwise there is multiple answers.
-
#circuitous? ⇒ Boolean
Does this (coverged) bubble contain any circuits?.
-
#converge_on(dynamic_programming_problem) ⇒ Object
Finish off the bubble, assuming convergent_on? the given problem == true.
- #converged? ⇒ Boolean
-
#convergent_on?(dynamic_programming_problem) ⇒ Boolean
return true if the given problem converges the bubble, else false.
-
#converging_oriented_node ⇒ Object
Return the OrientedNode that converges this bubble, behaviour undefined if bubble is not converged.
-
#coverage ⇒ Object
Coverage of a bubble is the coverage of each node in the bubble each weighted by their length.
-
#each_path(options = {}) ⇒ Object
Iterate over the paths returning each as an OrientedNodeTrail.
- #enqueue(dynamic_programming_problem) ⇒ Object
-
#initialize(bubble_root, options = {}) ⇒ Bubble
constructor
A new instance of Bubble.
- #num_known_problems ⇒ Object
-
#oriented_nodes ⇒ Object
yield or failing that return an Array of the list of oriented_nodes found in at least one path in this (presumed converged) bubble.
- #paths ⇒ Object
-
#reference_trail(max_cycles = @max_cycles) ⇒ Object
Return one trail that exemplifies the paths through this bubble.
- #reverse! ⇒ Object
-
#shift ⇒ Object
Return the next closest dynamic programming problem, removing it from the bubble.
- #shortest_problem_distance(prob) ⇒ Object
- #to_shorthand ⇒ Object
- #ubiquitous_oriented_nodes(prob) ⇒ Object
- #visited_oriented_nodes(prob) ⇒ Object
Methods included from FinishM::Logging
Constructor Details
#initialize(bubble_root, options = {}) ⇒ Bubble
Returns a new instance of Bubble.
496 497 498 499 500 501 502 503 |
# File 'lib/assembly/bubbly_assembler.rb', line 496 def initialize(bubble_root, = {}) @queue = DS::AnyPriorityQueue.new {|a,b| a<=b} @known_problems = {} @current_problems = Set.new @num_legit_forks = 0 @max_cycles = [:max_cycles] || DEFAULT_MAX_CYCLES @root = bubble_root end |
Instance Attribute Details
#converging_oriented_node_settable ⇒ Object (readonly)
The DynamicProgrammingProblem this bubble converges on
491 492 493 |
# File 'lib/assembly/bubbly_assembler.rb', line 491 def converging_oriented_node_settable @converging_oriented_node_settable end |
#is_reverse ⇒ Object (readonly)
The DynamicProgrammingProblem this bubble converges on
491 492 493 |
# File 'lib/assembly/bubbly_assembler.rb', line 491 def is_reverse @is_reverse end |
#num_legit_forks ⇒ Object
how many legit forks have been explored
494 495 496 |
# File 'lib/assembly/bubbly_assembler.rb', line 494 def num_legit_forks @num_legit_forks end |
#root ⇒ Object (readonly)
The DynamicProgrammingProblem this bubble converges on
491 492 493 |
# File 'lib/assembly/bubbly_assembler.rb', line 491 def root @root end |
Instance Method Details
#[](index) ⇒ Object
This doesn’t make sense unless this is a converged bubble and the index == -1 because otherwise there is multiple answers
759 760 761 762 763 764 765 |
# File 'lib/assembly/bubbly_assembler.rb', line 759 def [](index) raise unless index == -1 return Bio::Velvet::Graph::OrientedNodeTrail::OrientedNode.new( @converging_oriented_node_settable[0], @converging_oriented_node_settable[1] ) end |
#circuitous? ⇒ Boolean
Does this (coverged) bubble contain any circuits?
787 788 789 790 791 792 793 794 |
# File 'lib/assembly/bubbly_assembler.rb', line 787 def circuitous? raise unless converged? if @circuitous.nil? each_path({:max_cycles => 0}) {|| break if @circuitous} @circuitous ||= false end @circuitous end |
#converge_on(dynamic_programming_problem) ⇒ Object
Finish off the bubble, assuming convergent_on? the given problem == true
562 563 564 565 566 567 |
# File 'lib/assembly/bubbly_assembler.rb', line 562 def converge_on(dynamic_programming_problem) @converging_oriented_node_settable = dynamic_programming_problem.to_settable #free some memory @queue = nil @current_problems = nil end |
#converged? ⇒ Boolean
730 731 732 |
# File 'lib/assembly/bubbly_assembler.rb', line 730 def converged? !@converging_oriented_node_settable.nil? end |
#convergent_on?(dynamic_programming_problem) ⇒ Boolean
return true if the given problem converges the bubble, else false
552 553 554 555 556 557 558 559 |
# File 'lib/assembly/bubbly_assembler.rb', line 552 def convergent_on?(dynamic_programming_problem) settable = dynamic_programming_problem.to_settable @queue.each do |problem| #convergent until not return false unless ubiquitous_oriented_nodes(problem).include? settable end return true end |
#converging_oriented_node ⇒ Object
Return the OrientedNode that converges this bubble, behaviour undefined if bubble is not converged
736 737 738 |
# File 'lib/assembly/bubbly_assembler.rb', line 736 def converging_oriented_node @known_problems[@converging_oriented_node_settable][0].path[-1] end |
#coverage ⇒ Object
Coverage of a bubble is the coverage of each node in the bubble each weighted by their length
798 799 800 801 802 803 804 805 806 807 |
# File 'lib/assembly/bubbly_assembler.rb', line 798 def coverage sum = 0.0 length = 0 oriented_nodes do |onode| node_length = onode.node.length_alone sum += onode.node.coverage * node_length length += node_length end return sum / length end |
#each_path(options = {}) ⇒ Object
Iterate over the paths returning each as an OrientedNodeTrail. Assumes the path is convergent.
610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 |
# File 'lib/assembly/bubbly_assembler.rb', line 610 def each_path( = {}) raise unless converged? max_cycles = [:max_cycles] || @max_cycles # Metric used to prioritise each_path comparator = lambda do |problem1, problem2| onode1 = nil onode2 = nil if problem1.path.length == 1 and problem2.path.length > 1 # Here the comparison cannot be made on 2nd last node coverages # since one of the paths goes straight from the initial to the terminal # node. Choose instead based on if the second last node has higher or lower # coverage than the final node onode1 = problem1.path[-1] onode2 = problem2.path[-2] elsif problem2.path.length == 1 and problem1.path.length > 1 onode1 = problem1.path[-2] onode2 = problem2.path[-1] else onode1 = problem1.path[-2] onode2 = problem2.path[-2] end #log.debug "Comparing nodes #{onode1.node.node_id} and #{onode2.node.node_id}" if log.debug? if onode1.node.coverage == onode2.node.coverage -(onode1.node.node_id <=> onode2.node.node_id) else onode1.node.coverage <=> onode2.node.coverage end end log.debug "Iterating through each path of bubble" if log.debug? # Tim - use stack and push paths with lowest coverage first stack = DS::Stack.new counter = Bio::AssemblyGraphAlgorithms::SingleCoherentPathsBetweenNodesFinder::CycleCounter.new max_cycles initial_solution = @known_problems[@converging_oriented_node_settable][0] stack.push [initial_solution.path, []] converging_onode = converging_oriented_node #log.debug "Pushed to stack #{initial_solution.path.to_shorthand}" if log.debug? while path_parts = stack.pop direct_node_trail = path_parts[0] second_part = path_parts[1] #log.debug "Popped #{direct_node_trail.to_shorthand} and [#{second_part.collect{|o| o.to_shorthand}.join(',') }]" if log.debug? if direct_node_trail.trail.length == 0 # check for cycles through bubble root if second_part.include? @root #log.debug "Found cycle through bubble root." if log.debug? @circuitous = true unless @circuitous if max_cycles == 0 or max_cycles < counter.path_cycle_count([@root]+second_part) #log.debug "Not finishing cyclic path with too many cycles." if log.debug? next end end yield_path = Bio::Velvet::Graph::OrientedNodeTrail.new yield_path.trail = second_part if @is_reverse yield_path = yield_path.reverse end log.debug "Yielded #{yield_path.to_shorthand}" if log.debug? yield yield_path else # go down the path, looking for other paths head_onode = direct_node_trail.trail[-1] new_second_part = [head_onode]+second_part if second_part.length > 1 and head_onode == converging_oriented_node #log.debug "Ignoring path with cycle through converged node." if log.debug? next end if second_part.include? head_onode #log.debug "Cycle at node #{head_onode.node_id} in path #{second_part.collect{|onode| onode.node.node_id}.join(',')}." if log.debug? @circuitous = true unless @circuitous if max_cycles == 0 or max_cycles < counter.path_cycle_count(new_second_part) #log.debug "Not finishing cyclic path with too many cycles." if log.debug? next end end new_problems = @known_problems[head_onode.to_settable] #log.debug "Found new problems: #{new_problems.collect{|prob| prob.to_shorthand}.join(' ') }" if log.debug? problem_leads = Set.new filtered_problems = new_problems.reject do |new_problem| # Only enqueue paths where the second-to-head onode is not already queued unless new_problem.path.length < 2 lead_settable = new_problem.path[-2].to_settable if problem_leads.include? lead_settable #log.debug "Ignoring duplicate neighbour problem #{new_problem.to_shorthand}" if log.debug? next true end problem_leads << lead_settable end false end filtered_problems.sort(&comparator).each do |new_problem| # TODO: deal with circuits new_trail = Bio::Velvet::Graph::OrientedNodeTrail.new new_trail.trail = new_problem.path[0...-1] #log.debug "Enqueuing #{new_trail.to_shorthand} and [#{new_second_part.collect{|o| o.to_shorthand}.join(',') }]" if log.debug? stack.push [new_trail, new_second_part] end end end end |
#enqueue(dynamic_programming_problem) ⇒ Object
536 537 538 539 540 541 542 543 544 545 546 547 548 |
# File 'lib/assembly/bubbly_assembler.rb', line 536 def enqueue(dynamic_programming_problem) settable = dynamic_programming_problem.to_settable @known_problems[settable] ||= [] @known_problems[settable].push dynamic_programming_problem # don't requeue current problem or circular problem unless dynamic_programming_problem.circular_path_detected == true or @current_problems.include? settable @queue.enqueue dynamic_programming_problem, shortest_problem_distance(dynamic_programming_problem) @current_problems << settable end end |
#num_known_problems ⇒ Object
603 604 605 |
# File 'lib/assembly/bubbly_assembler.rb', line 603 def num_known_problems @known_problems.length end |
#oriented_nodes ⇒ Object
yield or failing that return an Array of the list of oriented_nodes found in at least one path in this (presumed converged) bubble
571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 |
# File 'lib/assembly/bubbly_assembler.rb', line 571 def oriented_nodes raise unless converged? seen_nodes = {} stack = DS::Stack.new initial_solution = @known_problems[@converging_oriented_node_settable][0] converging_onode = initial_solution.path[-1] stack.push converging_onode while onode = stack.pop settable = onode.to_settable next if seen_nodes.key?(settable) if block_given? if @is_reverse yield onode.reverse else yield onode end end seen_nodes[settable] = onode # queue neighbours for paths that don't contain the converging onode @known_problems[settable].each do |dpp| stack.push dpp.path[-2] unless dpp.path.length < 2 or dpp.path[0...-1].include? converging_onode end end return nil if block_given? return seen_nodes.values end |
#paths ⇒ Object
722 723 724 725 726 727 728 |
# File 'lib/assembly/bubbly_assembler.rb', line 722 def paths to_return = [] each_path do |path| to_return.push path end to_return end |
#reference_trail(max_cycles = @max_cycles) ⇒ Object
Return one trail that exemplifies the paths through this bubble. Current method of path selection is simply greedy, taking the highest coverage node at each fork (or failing that the node with the lower node_id).
770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 |
# File 'lib/assembly/bubbly_assembler.rb', line 770 def reference_trail(max_cycles = @max_cycles) raise unless converged? converging_onode = converging_oriented_node log.debug "Finding reference trail from node #{converging_onode.node.node_id}" if log.debug? reference_trail = nil each_path do |path| #break when first path is found reference_trail = path break end return reference_trail end |
#reverse! ⇒ Object
752 753 754 755 |
# File 'lib/assembly/bubbly_assembler.rb', line 752 def reverse! @is_reverse ||= false @is_reverse = !@is_reverse end |
#shift ⇒ Object
Return the next closest dynamic programming problem, removing it from the bubble
507 508 509 510 511 512 513 514 515 |
# File 'lib/assembly/bubbly_assembler.rb', line 507 def shift prob = @queue.shift unless prob.nil? prob.ubiquitous_oriented_nodes = ubiquitous_oriented_nodes(prob) prob.visited_oriented_nodes = visited_oriented_nodes(prob) @current_problems.delete prob.to_settable end return prob end |
#shortest_problem_distance(prob) ⇒ Object
531 532 533 534 |
# File 'lib/assembly/bubbly_assembler.rb', line 531 def shortest_problem_distance(prob) # prioritise by the shortest distance for current problem @known_problems[prob.to_settable].collect{|prob| prob.distance}.min end |
#to_shorthand ⇒ Object
740 741 742 743 744 745 746 747 748 749 750 |
# File 'lib/assembly/bubbly_assembler.rb', line 740 def to_shorthand shorts = [] if converged? shorts = paths.sort{|a,b| a.to_shorthand <=> b.to_shorthand }.collect{|path| path.to_shorthand} else @queue.each do |problem| shorts.push problem.to_shorthand end end return "{#{shorts.join('|') }}" end |
#ubiquitous_oriented_nodes(prob) ⇒ Object
524 525 526 527 528 529 |
# File 'lib/assembly/bubbly_assembler.rb', line 524 def ubiquitous_oriented_nodes(prob) #only ubiquitous nodes from relevant problems @known_problems[prob.to_settable].reduce(prob.ubiquitous_oriented_nodes) do |memo, problem| memo & problem.ubiquitous_oriented_nodes end end |
#visited_oriented_nodes(prob) ⇒ Object
517 518 519 520 521 522 |
# File 'lib/assembly/bubbly_assembler.rb', line 517 def visited_oriented_nodes(prob) #all visited nodes for relevant problems @known_problems[prob.to_settable].reduce(prob.ubiquitous_oriented_nodes) do |memo, problem| memo + problem.ubiquitous_oriented_nodes end end |