Class: Bio::DB::Fasta::FastaFile

Inherits:
Object
Defined in:
lib/bio/db/fastadb.rb

Overview

Class that wraps a FASTA file and its samtools .fai index, so that the file can be queried like a sequence database.

Constructor Details

#initialize(args) ⇒ FastaFile

Initializes the FASTA file. If the .fai index file does not exist, it is generated at startup. The args hash accepts:

  • fasta - path to the FASTA file

  • samtools - path to samtools; if it is not provided, the bundled version is used

Raises:

  • (FastaDBException) if no fasta path is given

# File 'lib/bio/db/fastadb.rb', line 177

def initialize(args)
  @fasta_path = args[:fasta]
  # Fall back to the samtools binary bundled with the library
  @samtools = args[:samtools] || File.join(File.expand_path(File.dirname(__FILE__)), 'sam', 'external', 'samtools')
  raise FastaDBException, "No path for the reference fasta file." if @fasta_path.nil?
  @fai_file = @fasta_path + ".fai"
  # Build the index only when it is missing
  unless File.file?(@fai_file)
    command = "#{@samtools} faidx '#{@fasta_path}'"
    @last_command = command
    system(command)
  end
end
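
The index path and the indexing command used by the constructor can be sketched in isolation. This is a minimal illustration, not part of Bio::DB::Fasta: fai_path_for and faidx_index_command are hypothetical helper names, the path is made up, and Shellwords.escape is swapped in for the single quotes used in the listing above so that paths with spaces or quotes stay safe.

```ruby
require 'shellwords'

# Hypothetical helper (not in Bio::DB::Fasta): path of the index that
# "samtools faidx" writes next to the FASTA file.
def fai_path_for(fasta_path)
  "#{fasta_path}.fai"
end

# Hypothetical helper: builds the indexing command with both arguments
# shell-escaped instead of wrapped in single quotes.
def faidx_index_command(samtools, fasta_path)
  "#{Shellwords.escape(samtools)} faidx #{Shellwords.escape(fasta_path)}"
end

fasta = 'test data/ref.fasta' # made-up path containing a space
puts fai_path_for(fasta)
puts faidx_index_command('samtools', fasta)
```

The constructor only runs the command when File.file? reports that the .fai file is missing, so an existing index is reused.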

Instance Attribute Details

#fasta_path ⇒ Object (readonly)

Returns the value of attribute fasta_path.



# File 'lib/bio/db/fastadb.rb', line 172

def fasta_path
  @fasta_path
end

#index ⇒ Object (readonly)

Returns the value of attribute index.



# File 'lib/bio/db/fastadb.rb', line 172

def index
  @index
end

Instance Method Details

#faidx(opts = {}) ⇒ Object

Indexes a reference sequence in FASTA format, or extracts a subsequence from an indexed reference sequence. If no region is specified, faidx indexes the file and creates <ref.fasta>.fai on disk. If a region is specified, the subsequence is retrieved and printed to stdout in FASTA format.

Options (only needed when a subsequence is required):

  • chr - [STRING] the reference name of the subsequence

  • start - [INT] the start position for the subsequence

  • stop - [INT] the stop position for the subsequence



# File 'lib/bio/db/fastadb.rb', line 209

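def faidx(opts={})
  if opts.has_key?(:chr) and opts.has_key?(:start) and opts.has_key?(:stop)
    # Pass the option values rather than the literal symbols, and leave opts intact
    self.fetch_reference(opts[:chr], opts[:start], opts[:stop], {:as_bio => false})
  else
    # No region given: (re)build the .fai index
    command = "#{@samtools} faidx #{@fasta_path}"
    @last_command = command
    system(command)
  end
end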
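
The chr, start and stop options correspond to the "chromosome:start-end" region syntax that samtools accepts. A minimal sketch of that mapping, assuming a hypothetical helper named region_string and a made-up reference name (neither is defined by the library):

```ruby
# Hypothetical helper: formats faidx-style options as a samtools region
# string, or returns nil when any of the three keys is missing.
def region_string(opts = {})
  required = [:chr, :start, :stop]
  return nil unless required.all? { |key| opts.key?(key) }
  "#{opts[:chr]}:#{opts[:start]}-#{opts[:stop]}"
end

puts region_string(chr: 'chr_1', start: 1, stop: 100)
p region_string(chr: 'chr_1') # incomplete options, so no region
```

Returning nil for incomplete options mirrors the branch in faidx: without all three keys, the method falls through to indexing the file.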

#fetch_sequence(region) ⇒ Object

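The region must respond to to_region, or its to_s must return a string in the format “chromosome:start-end”, as in samtools. If the region reports an orientation of :reverse, the reverse complement of the sequence is returned.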



# File 'lib/bio/db/fastadb.rb', line 222

def fetch_sequence(region)
  # Prefer to_region when available; otherwise rely on to_s
  query = region.to_s
  query = region.to_region.to_s if region.respond_to?(:to_region)
  command = "#{@samtools} faidx #{@fasta_path} '#{query}'"
  puts command if $VERBOSE
  @last_command = command
  seq = ""
  # Concatenate the sequence lines, skipping the FASTA header
  yield_from_pipe(command, String, :text) { |line| seq << line unless line =~ /^>/ }

  reference = Bio::Sequence::NA.new(seq)

  # Regions without an orientation (e.g. plain strings) are returned as-is
  if region.respond_to?(:orientation) && region.orientation == :reverse
    reference.reverse_complement!
  end
  reference
end
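
The two steps after the samtools call (dropping the ">" header line and reverse-complementing for :reverse regions) can be tried without samtools or BioRuby. This is a sketch under stated assumptions: SketchRegion, sequence_from_fasta and reverse_complement are hypothetical stand-ins, the FASTA text is made up, and String#tr replaces Bio::Sequence::NA#reverse_complement!.

```ruby
# Hypothetical stand-in for a region object: to_s yields the samtools
# "chromosome:start-end" format and orientation is a plain attribute.
SketchRegion = Struct.new(:entry, :start, :stop, :orientation) do
  def to_s
    "#{entry}:#{start}-#{stop}"
  end
end

# Joins the sequence lines of FASTA text, skipping ">" header lines.
def sequence_from_fasta(text)
  text.each_line.reject { |line| line.start_with?('>') }.map(&:chomp).join
end

# Plain-Ruby reverse complement for unambiguous DNA bases only.
def reverse_complement(seq)
  seq.reverse.tr('acgtACGT', 'tgcaTGCA')
end

region = SketchRegion.new('chr_1', 1, 8, :reverse)
fasta_output = ">chr_1:1-8\nAACC\nGGTA\n" # made-up samtools output
seq = sequence_from_fasta(fasta_output)
seq = reverse_complement(seq) if region.orientation == :reverse
puts region.to_s
puts seq
```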

#load_fai_entries ⇒ Object

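Loads the entries of the .fai file into the index and returns the number of entries. The file is read only on the first call; later calls return the size of the cached index.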



# File 'lib/bio/db/fastadb.rb', line 191

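def load_fai_entries()
  # Reuse the cached index if it was already loaded
  return @index.length if @index
  @index = Index.new
  # File.foreach closes the file once every line has been read
  File.foreach(@fai_file) do |line|
    fields = line.split("\t")
    # Column 1 is the sequence name, column 2 its length
    @index << Entry.new(fields[0], fields[1])
  end
  @index.length
end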
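
A .fai file is tab-separated with five columns per sequence: name, length, byte offset, bases per line, and bytes per line. Only the first two are used above. The parsing can be sketched on its own, assuming a hypothetical FaiRecord struct in place of the library's Index and Entry classes (which are not shown here) and made-up index lines:

```ruby
# Hypothetical record type; stands in for the library's Entry class.
FaiRecord = Struct.new(:id, :seq_length)

# Parses .fai text into records, keeping the name and the length
# (converted to an Integer) and ignoring the remaining columns.
def parse_fai(text)
  text.each_line.map do |line|
    fields = line.chomp.split("\t")
    FaiRecord.new(fields[0], fields[1].to_i)
  end
end

# Made-up index content for two sequences
fai_text = "chr_1\t69930\t7\t60\t61\nchr_2\t1200\t71110\t60\t61\n"
records = parse_fai(fai_text)
puts records.length
puts records.first.seq_length
```

Unlike the listing above, this sketch converts the length to an Integer; whether Entry does the same internally is not visible from this page.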