Class: Bio::MAF::Access

Inherits:
Object
Defined in:
lib/bio/maf/index.rb

Overview

Top-level class for working with a set of indexed MAF files. Provides a higher-level alternative to working with Parser and KyotoIndex objects directly.

Instantiate with the Access.maf_dir and Access.file class methods rather than calling the constructor directly.

Constructor Details

#initialize(options) ⇒ Access

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns a new instance of Access.



# File 'lib/bio/maf/index.rb', line 187

def initialize(options)
  @parse_options = options
  @indices = {}
  @maf_by_chrom = {}
  if options[:dir]
    @dir = options[:dir]
    @maf_files = Dir.glob("#{@dir}/*.maf")
  elsif options[:maf]
    @maf_files = [options[:maf]]
    if options[:index]
      register_index(KyotoIndex.open(options[:index]),
                     options[:maf])
    end
  else
    raise "Must specify :dir or :maf!"
  end
  scan_indices!
  if options[:maf] && @indices.empty?
    # MAF file explicitly given but no index
    # build a temporary one
    # (could build a real one, too...)
    maf = options[:maf]
    parser = Parser.new(maf, @parse_options)
    LOG.warn { "WARNING: building temporary index on #{maf}." }
    index = KyotoIndex.build(parser, '%')
    register_index(index, maf)
  end
end

Instance Attribute Details

#block_filter ⇒ Hash

Block filter to apply.

Returns:

  • (Hash)



# File 'lib/bio/maf/index.rb', line 82

def block_filter
  @block_filter
end

#indices ⇒ Object (readonly)

Returns the value of attribute indices.



# File 'lib/bio/maf/index.rb', line 83

def indices
  @indices
end

#parse_options ⇒ Hash

Parser options.

Returns:

  • (Hash)



# File 'lib/bio/maf/index.rb', line 74

def parse_options
  @parse_options
end

#sequence_filter ⇒ Hash

Sequence filter to apply.

Returns:

  • (Hash)



# File 'lib/bio/maf/index.rb', line 78

def sequence_filter
  @sequence_filter
end

Class Method Details

.file(maf, index = nil, options = {}) ⇒ Access

Provides access to a single MAF file. If this file is not indexed, it will be fully parsed to create a temporary in-memory index. For large MAF files or ones which will be used multiple times, this is inefficient, and an index file should be created with maf_index(1).

Parameters:

  • maf (String)

    path to MAF file

  • index (String) (defaults to: nil)

    Kyoto Cabinet index file

  • options (Hash) (defaults to: {})

    parser options

Returns:

  • (Access)



# File 'lib/bio/maf/index.rb', line 107

def self.file(maf, index=nil, options={})
  o = options.dup
  o[:maf] = maf
  o[:index] = index if index
  self.new(o)
end
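
The factory copies the caller's hash before adding :maf and :index, so the hash passed in is left unmodified. A minimal sketch of that pattern, using a hypothetical stand-in class (SketchAccess is not part of bio-maf and opens no files; the option name is only illustrative):

```ruby
# Hypothetical stand-in for Bio::MAF::Access; it only records its options.
class SketchAccess
  attr_reader :options

  def initialize(options)
    @options = options
  end

  # Mirrors the shape of Access.file: copy the options, then add keys.
  def self.file(maf, index = nil, options = {})
    o = options.dup            # shallow copy; the caller's hash is untouched
    o[:maf] = maf
    o[:index] = index if index # :index is only set when an index is given
    new(o)
  end
end

caller_opts = { parse_extended: true } # illustrative option name
access = SketchAccess.file('chr22.maf', 'chr22.kct', caller_opts)
no_index = SketchAccess.file('chr22.maf')
```

Because the copy is shallow, nested values inside the options hash would still be shared with the caller.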

.maf_dir(dir, options = {}) ⇒ Access

Provides access to a directory of indexed MAF files. Any files with .maf suffixes and accompanying .kct indexes in the given directory will be accessible.

Parameters:

  • dir (String)

    directory to scan

  • options (Hash) (defaults to: {})

    parser options

Returns:

  • (Access)



# File 'lib/bio/maf/index.rb', line 91

def self.maf_dir(dir, options={})
  o = options.dup
  o[:dir] = dir
  self.new(o)
end
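
The pairing of .maf files with .kct indexes can be sketched with the standard library alone. The helper name indexed_mafs is hypothetical; it only illustrates the file-name convention described above and does not open any Kyoto Cabinet database:

```ruby
require 'tmpdir'
require 'fileutils'

# Return a hash mapping each .maf path in dir to its .kct index path,
# skipping MAF files that have no index next to them.
def indexed_mafs(dir)
  Dir.glob(File.join(dir, '*.maf')).sort.each_with_object({}) do |maf, pairs|
    index_f = File.join(dir, "#{File.basename(maf, '.maf')}.kct")
    pairs[maf] = index_f if File.exist?(index_f)
  end
end

# Demonstration in a scratch directory with empty placeholder files.
pairs = Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, 'chr1.maf'))
  FileUtils.touch(File.join(dir, 'chr1.kct'))
  FileUtils.touch(File.join(dir, 'chr2.maf')) # no index, so it is skipped
  indexed_mafs(dir).map { |maf, idx| [File.basename(maf), File.basename(idx)] }
end
```

In this sketch a MAF file without a matching index is silently skipped, which matches how scan_indices! only registers files for which an index file is found.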

Instance Method Details

#chrom_index(chrom) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/bio/maf/index.rb', line 241

def chrom_index(chrom)
  unless @indices.has_key? chrom
    raise "No index available for chromosome #{chrom}!"
  end
  @indices[chrom]
end

#close ⇒ Object

Close all open resources, in particular Kyoto Cabinet database handles.



# File 'lib/bio/maf/index.rb', line 116

def close
  @indices.values.each { |ki| ki.close }
end

#find(intervals) {|block| ... } ⇒ Enumerable<Block>

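Find all alignment blocks in the genomic regions in the given list of Bio::GenomicInterval objects, and parse them with a parser built from parse_options. Intervals are grouped by chromosome, and an error is raised before any block is yielded if any requested chromosome has no index.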

Parameters:

  • intervals (Enumerable<Bio::GenomicInterval>)

    genomic intervals to search

Yields:

  • (block)

    each Block matched, in turn

Returns:

  • (Enumerable<Block>)

    each matching Block, if no block given



# File 'lib/bio/maf/index.rb', line 130

def find(intervals, &blk)
  if block_given?
    by_chrom = intervals.group_by { |i| i.chrom }
    by_chrom.keys.each do |chrom|
      unless @indices.has_key? chrom
        raise "No index available for chromosome #{chrom}!"
      end
    end
    by_chrom.each do |chrom, c_intervals|
      index = @indices[chrom]
      with_parser(chrom) do |parser|
        index.find(c_intervals, parser, block_filter, &blk)
      end
    end
  else
    enum_for(:find, intervals)
  end
end
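
The control flow of #find (return an enumerator when no block is given, group the queries by chromosome, validate every chromosome first, then yield) can be sketched without bio-maf. Interval and SketchFinder are hypothetical stand-ins, and the index lookup is replaced by simply yielding the query intervals:

```ruby
# Hypothetical interval type; bio-maf uses Bio::GenomicInterval instead.
Interval = Struct.new(:chrom, :start, :stop)

class SketchFinder
  # chroms: names of the chromosomes that have an "index" in this sketch
  def initialize(chroms)
    @chroms = chroms
  end

  def find(intervals, &blk)
    # Without a block, hand back an Enumerator that re-enters this method.
    return enum_for(:find, intervals) unless block_given?

    by_chrom = intervals.group_by(&:chrom)
    # Validate every chromosome before yielding anything.
    by_chrom.each_key do |chrom|
      raise "No index available for chromosome #{chrom}!" unless @chroms.include?(chrom)
    end
    # Stand-in for the index lookup: yield the intervals per chromosome.
    by_chrom.each_value { |c_intervals| c_intervals.each(&blk) }
  end
end

finder = SketchFinder.new(%w[chr1 chr2])
queries = [Interval.new('chr2', 10, 20), Interval.new('chr1', 0, 5),
           Interval.new('chr2', 30, 40)]
order = finder.find(queries).map { |i| [i.chrom, i.start] }
```

One consequence of grouping is that results arrive chromosome by chromosome rather than in the order the intervals were supplied.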

#find_index_file(maf) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/bio/maf/index.rb', line 217

def find_index_file(maf)
  base = File.basename(maf, '.maf')
  index_f = "#{@dir}/#{base}.kct"
  # File.exists? was deprecated and later removed from Ruby; use File.exist?
  File.exist?(index_f) ? index_f : nil
end

#register_index(index, maf) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/bio/maf/index.rb', line 224

def register_index(index, maf)
  @indices[index.ref_seq] = index
  @maf_by_chrom[index.ref_seq] = maf
end

#scan_indices! ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/bio/maf/index.rb', line 230

def scan_indices!
  @maf_files.each do |maf|
    index_f = find_index_file(maf)
    if index_f
      index = KyotoIndex.open(index_f)
      register_index(index, maf)
    end
  end
end

#slice(interval) {|block| ... } ⇒ Enumerable<Block>

Find and parse all alignment blocks in the genomic region given by a Bio::GenomicInterval, and truncate them to just the region intersecting that interval.

Parameters:

  • interval (Bio::GenomicInterval)

    genomic interval to slice

Yields:

  • (block)

    each Block matched, in turn

Returns:

  • (Enumerable<Block>)

    each matching Block, if no block given



# File 'lib/bio/maf/index.rb', line 177

def slice(interval, &blk)
  index = chrom_index(interval.chrom)
  with_parser(interval.chrom) do |parser|
    index.slice(interval, parser, &blk)
  end
end
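
The coordinate side of truncating a block to the queried interval is an interval intersection. The following is a minimal sketch under assumptions: Span is a hypothetical half-open range type, and the sketch ignores the gap columns that a real alignment slice would also have to handle.

```ruby
# Hypothetical half-open range type: covers start...stop on one chromosome.
Span = Struct.new(:chrom, :start, :stop)

# Return the part of `block` that lies inside `query`, or nil if the two
# do not overlap.
def clip(block, query)
  return nil unless block.chrom == query.chrom

  start = [block.start, query.start].max
  stop  = [block.stop, query.stop].min
  start < stop ? Span.new(block.chrom, start, stop) : nil
end

query  = Span.new('chr1', 100, 200)
blocks = [Span.new('chr1', 50, 120), Span.new('chr1', 120, 180),
          Span.new('chr1', 190, 260), Span.new('chr1', 300, 400)]
clipped = blocks.map { |b| clip(b, query) }.compact
```

Blocks that straddle an edge of the query are shortened, blocks fully inside it pass through unchanged, and blocks outside it are dropped.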

#tile(interval) {|tiler| ... } ⇒ Object

Find and parse all alignment blocks in the genomic region given by a Bio::GenomicInterval, and combine them to synthesize a single alignment covering that interval exactly.

Parameters:

  • interval (Bio::GenomicInterval)

    genomic interval to tile

Yields:

  • (tiler)

    a Tiler ready to operate on the given interval



# File 'lib/bio/maf/index.rb', line 157

def tile(interval)
  index = chrom_index(interval.chrom)
  with_parser(interval.chrom) do |parser|
    tiler = Tiler.new
    tiler.index = index
    tiler.parser = parser
    tiler.interval = interval
    yield tiler
  end
end
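
To illustrate what "covering that interval exactly" means, here is a deliberately simplified, single-sequence sketch of tiling. It is not the Tiler implementation: Piece, tile_pieces and the '*' fill character are all assumptions made for illustration, and the real class works on multi-species alignment blocks.

```ruby
# Hypothetical block: ungapped text for one sequence beginning at `start`.
Piece = Struct.new(:start, :text)

# Build a string covering start...stop exactly, copying text from the
# pieces and using `fill` wherever no piece covers a position.
def tile_pieces(pieces, start, stop, fill = '*')
  out = fill * (stop - start)
  pieces.each do |piece|
    from = [piece.start, start].max
    to   = [piece.start + piece.text.length, stop].min
    next if from >= to # piece lies entirely outside the interval

    out[from - start, to - from] = piece.text[from - piece.start, to - from]
  end
  out
end

pieces = [Piece.new(8, 'ACGT'), Piece.new(14, 'GGCC')]
tiled = tile_pieces(pieces, 10, 20)
```

The result always has length stop - start: pieces that overhang the interval are trimmed, and uncovered positions keep the fill character.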

#with_parser(chrom) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/bio/maf/index.rb', line 249

def with_parser(chrom)
  LOG.debug { "Creating parser with options #{@parse_options.inspect}" }
  parser = Parser.new(@maf_by_chrom[chrom], @parse_options)
  parser.sequence_filter = self.sequence_filter
  begin
    yield parser
  ensure
    parser.close
  end
end
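
The begin/ensure structure guarantees that the parser is closed even when the caller's block raises. A minimal sketch of that guarantee, using a hypothetical SketchParser that only records whether close was called:

```ruby
# Hypothetical resource that records whether it has been closed.
class SketchParser
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end
end

# Yield a fresh parser and close it afterwards, even if the block raises.
def with_sketch_parser
  parser = SketchParser.new
  begin
    yield parser
  ensure
    parser.close
  end
end

kept = nil
begin
  with_sketch_parser do |parser|
    kept = parser
    raise IOError, 'simulated read failure'
  end
rescue IOError
  # The error still reaches the caller; the parser has already been closed.
end

value = with_sketch_parser { |_parser| 42 }
```

The ensure clause does not change the return value, so the method still returns whatever the block returns when no error occurs.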