bio-velvet_underground

This biogem aims to provide Ruby bindings to the velvet assembler's C source code. See also bio-velvet for Ruby code that does not bind to the velvet C code.

Installation

gem install bio-velvet_underground

This can take a few minutes as several versions of velvet with different kmer sizes are compiled.

Usage

The code is intended to cater for a few specific purposes.

Running velvet

Running velvet returns a Result object, which is effectively a pointer to a velvet result directory:

require 'bio-velvet_underground'

# Run an assembly with kmer 29. '-short my.fasta' are the arguments to velveth
# (not including the kmer and directory); no special arguments are given to velvetg.
# Use a pre-defined velvet result directory:
result = Bio::Velvet::Runner.new.velvet(29, %w(-short my.fasta), [], :output_assembly_path => '/path/to/result')
result.result_directory #=> '/path/to/result'

With the magic of Ruby-FFI, the library with the smallest kmer size >= 29 is chosen (in this case 31). Several libraries are compiled at gem install time and then bound at runtime. The velveth and velvetg steps can be run separately if required.
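The selection rule described above can be sketched in plain Ruby. This is only an illustration of "smallest compiled size that is at least the requested kmer": the list of compiled sizes and the method name are hypothetical, and the gem's own lookup may be implemented differently.

```ruby
# Hypothetical list of compiled kmer sizes. The real set is fixed when the
# gem is installed and is an assumption here.
COMPILED_KMER_SIZES = [31, 63, 95, 127].freeze

# Return the smallest compiled size that can accommodate the requested kmer,
# or nil when the request is larger than every compiled library.
# (Hypothetical helper, not part of the gem's API.)
def smallest_sufficient_kmer(requested, available = COMPILED_KMER_SIZES)
  available.select { |size| size >= requested }.min
end

puts smallest_sufficient_kmer(29) # prints 31
```

Returning nil for an oversized request leaves the caller to decide how to report the problem.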

Working with the binary sequence file

The binary sequence file is created when velveth is run with the -create_binary flag.

seqs = Bio::Velvet::Underground::BinarySequenceStore.new '/path/to/velvet/directory/CnyUnifiedSeq'
seqs.length #=> 77 (there are 77 sequences in the CnyUnifiedSeq)
seqs[1] #=> 'CACTTATCTCTACCAAAGATCACGATTTAGAATCAAACTATAAAGTTTTAGAAGATAAAGTAACAACTTATACATGGGGA'
seqs[0] #=> nil (indices map directly to the indices in other velvet files)
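
Because index 0 holds no sequence, a loop over the store has to start at 1. The helper below is a minimal sketch that relies only on the `length` and `[]` calls shown above; the method name is hypothetical, and it assumes the valid indices run from 1 to `length` inclusive.

```ruby
# Yield each (index, sequence) pair from any object that responds to
# `length` and `[]`, skipping the empty slot at index 0.
# Assumption: valid indices run from 1 to length inclusive.
# (Hypothetical helper, not part of the gem's API.)
def each_indexed_sequence(store)
  return enum_for(:each_indexed_sequence, store) unless block_given?

  (1..store.length).each do |index|
    yield index, store[index]
  end
end

# Example use with the store from above:
#   each_indexed_sequence(seqs) { |index, seq| puts ">#{index}", seq }
```

Taking the store as a duck-typed argument means the helper can be exercised with a small stand-in object, without a velvet directory on disk.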

Working with the LastGraph file

path = 'spec/data/3/Assem/LastGraph'
graph = Bio::Velvet::Underground::Graph.parse_from_file path #=> Bio::Velvet::Underground::Graph object

graph.hash_length #=> 31 (kmer length)
graph.node_count #=> 4

graph.nodes[1] #=> Bio::Velvet::Underground::Graph::Node object
graph.nodes[2].ends_of_kmers_of_node #=> 'GTTTAAAAGAAGGAGATTACTTTATAAAA'
graph.nodes[2].coverages #=> [58,0] (coverages from different categories)

graph.nodes[1].short_reads #=> Array of Bio::Velvet::Underground::Graph::NodedRead objects
graph.nodes[1].short_reads[0].direction #=> true (i.e. forward w.r.t the node)
graph.nodes[1].short_reads[2].read_id #=> 4
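
As one example of combining these calls, the sketch below tallies the reads on a node by orientation. It uses only `short_reads` and `direction` as shown above; the method name is hypothetical, and it assumes `direction` is false (or nil) for reads that are reverse with respect to the node.

```ruby
# Count the reads attached to a node by orientation.
# Assumptions: `short_reads` returns an Array (or nil when there are no
# reads), and `direction` is truthy only for forward reads.
# (Hypothetical helper, not part of the gem's API.)
def read_direction_counts(node)
  forward, reverse = Array(node.short_reads).partition { |read| read.direction }
  { forward: forward.length, reverse: reverse.length }
end

# Example use with the graph from above:
#   read_direction_counts(graph.nodes[1])
```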

There is more to these objects; see the documentation.

Patches to these and other parts of velvet are welcome.

Development practice

The velvet C code 'underground' here is for the most part vanilla velvet code, as you might expect. However, some changes were necessary to allow binding from this biogem. For instance, the library does not write to $stdout, as this interferes with Ruby's writes to $stdout.

There are also some extra options for controlling velvet's behaviour, geared towards taking some of the guesswork out of the assembly process at the expense of a less resolved LastGraph. These are currently non-standard modifications - get in touch with @wwood if you are interested. Not invoking these options should leave 'normal' velvet behaviour intact.

Project home page

For information on the source tree, documentation, examples, issues and how to contribute, see

http://github.com/wwood/bioruby-velvet_underground

The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.

Cite

This software is currently unpublished.

Biogems.info

This Biogem is published at http://biogems.info/index.html#bio-velvet_underground

Copyright (c) 2014 Ben Woodcroft. See LICENSE.txt for further details.