Module: Bio::BioAlignment::TreeSplitter
- Defined in:
- lib/bio-alignment/edit/tree_splitter.rb
Overview
Split an alignment based on its phylogeny
Instance Method Summary
- #split_on_distance(target_size = nil) ⇒ Object
Split an alignment using a phylogeny tree.
Instance Method Details
#split_on_distance(target_size = nil) ⇒ Object
Split an alignment using a phylogeny tree. One set contains sequences that are relatively homologous; the other contains the rest. This behavior is described in tree-split.feature in the features directory.
The target_size parameter gives the size of the homologous sequence set. If target_size is nil, it defaults to size/2+1, so the alignment is split roughly in half. The split follows the tree, so the actual set size is the closest match the tree allows rather than an exact count.
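Returns two alignments with their matching trees attached: first the alignment reduced to the tree without the split-off branch, then the alignment for the branch itself.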
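The default comes from the first statement of the method, `target_size = size/2+1 if not target_size`. A minimal sketch of that arithmetic follows; the helper name `default_target_size` is hypothetical and not part of the library.

```ruby
# Hypothetical helper (not part of bio-alignment) that mirrors the
# default computed at the top of split_on_distance.
def default_target_size(size)
  size / 2 + 1 # integer division, so the target is just over half
end

puts default_target_size(8) # prints 5
puts default_target_size(7) # prints 4
```

Because Ruby integer division truncates, both even and odd alignment sizes yield a target slightly larger than half.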
# File 'lib/bio-alignment/edit/tree_splitter.rb', line 15

def split_on_distance target_size = nil
  target_size = size/2+1 if not target_size
  aln1 = clone
  # Start from the root of the tree (FIXME: what if there is no root?)
  prev_root = nil
  new_root = aln1.tree.root
  while new_root
    # find the nearest child (shortest edge)
    near_children = new_root.nearest_children
    # We possibly have multiple matches, so we are going to split on the
    # number of leafs, or we leave it like it is, if the split will be
    # too far from the target
    prev_root = new_root
    new_root = near_children.first
    near_children.each do |c|
      next if c == new_root
      # find the nearest match
      if (c.leaves.size-target_size).abs < (new_root.leaves.size-target_size).abs
        new_root = c
      end
    end
    # Break out of the loop when we hit the target
    break if new_root.leaves.size <= target_size
  end
  # Now see if whether the last step actually was an improvement, otherwise
  # we take one node up
  # p [(prev_root.leaves.size-target_size).abs,(new_root.leaves.size-target_size).abs]
  new_root = prev_root if (prev_root.leaves.size-target_size).abs < (new_root.leaves.size-target_size).abs
  branch = aln1.tree.clone_subtree(new_root)
  reduced_tree = aln1.tree.clone_tree_without_branch(new_root)
  # p branch.map { |n| n.name }.compact
  # p reduced_tree.map { |n| n.name }.compact
  # Now reduce the alignments themselves to match the trees
  aln1 = tree_reduce(reduced_tree)
  aln2 = tree_reduce(branch)
  return aln1,aln2
end
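The descent in the method above can be tried in isolation on a toy tree. The sketch below is not the library's tree class: `Node`, `leaf`, and `pick_split_node` are hypothetical names, and it assumes `nearest_children` returns the children that share the shortest edge. It also adds a guard for nodes without children, which the original loop does not have.

```ruby
# Toy tree node; NOT the library's tree class. `distance` is the edge
# length from this node to its parent.
Node = Struct.new(:name, :distance, :children) do
  def leaf?
    children.empty?
  end

  # All leaf nodes at or below this node.
  def leaves
    leaf? ? [self] : children.flat_map(&:leaves)
  end

  # Assumed behaviour: the children that share the shortest edge.
  def nearest_children
    return [] if children.empty?
    min = children.map(&:distance).min
    children.select { |c| c.distance == min }
  end
end

# Convenience constructor for a leaf node (hypothetical helper).
def leaf(name, distance)
  Node.new(name, distance, [])
end

# Walk down from the root along the shortest edges until the subtree
# holds at most target_size leaves, then step back up one node if the
# parent was closer to the target.
def pick_split_node(root, target_size)
  prev = nil
  node = root
  while node
    candidates = node.nearest_children
    break if candidates.empty? # guard added for this sketch
    prev = node
    # Among tied nearest children, take the one closest to the target.
    node = candidates.min_by { |c| (c.leaves.size - target_size).abs }
    break if node.leaves.size <= target_size
  end
  if prev && (prev.leaves.size - target_size).abs < (node.leaves.size - target_size).abs
    node = prev
  end
  node
end

a = Node.new("A", 0.1, [leaf("a1", 0.05), leaf("a2", 0.05),
                        Node.new("A2", 0.02, [leaf("a3", 0.01), leaf("a4", 0.01)])])
b = Node.new("B", 0.5, [leaf("b1", 0.1), leaf("b2", 0.1), leaf("b3", 0.1)])
root = Node.new("root", 0.0, [a, b])

split = pick_split_node(root, 4)
puts split.name                          # prints A
puts split.leaves.map(&:name).join(",")  # prints a1,a2,a3,a4
```

With seven leaves and a target of 4, the walk stops at node A, whose four leaves would form the homologous set; the remaining three leaves under B would form the other alignment.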