Class: Bio::MAF::Access

Inherits:
Object
Defined in:
lib/bio/maf/index.rb

Overview

Top-level class for working with a set of indexed MAF files. Provides a higher-level alternative to working with Parser and KyotoIndex objects directly.

Instantiate with the Access.maf_dir or Access.file class methods rather than calling the constructor directly.


Constructor Details

#initialize(options) ⇒ Access

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Returns a new instance of Access.



# File 'lib/bio/maf/index.rb', line 194

def initialize(options)
  @parse_options = options
  @indices = {}
  @maf_by_chrom = {}
  if options[:dir]
    scan_dir(options[:dir])
  elsif options[:maf]
    if options[:index]
      LOG.debug { "Opening index file #{options[:index]}" }
      index = KyotoIndex.open(options[:index])
      register_index(index,
                     options[:maf])
      index.close
    else
      idx_f = find_index_file(options[:maf])
      if idx_f
        index = KyotoIndex.open(idx_f)
        register_index(index, options[:maf])
        index.close
      end
    end
  else
    raise "Must specify :dir or :maf!"
  end
  if options[:maf] && @indices.empty?
    # MAF file explicitly given but no index
    # build a temporary one
    # (could build a real one, too...)
    maf = options[:maf]
    parser = Parser.new(maf, @parse_options)
    LOG.warn { "WARNING: building temporary index on #{maf}." }
    index = KyotoIndex.build(parser, '%')
    register_index(index, maf)
  end
end

Instance Attribute Details

#block_filter ⇒ Hash

Block filter to apply.

Returns:

  • (Hash)

# File 'lib/bio/maf/index.rb', line 83

def block_filter
  @block_filter
end

#indices ⇒ Object (readonly)

Returns the value of attribute indices.



# File 'lib/bio/maf/index.rb', line 84

def indices
  @indices
end

#parse_options ⇒ Hash

Parser options.

Returns:

  • (Hash)

# File 'lib/bio/maf/index.rb', line 75

def parse_options
  @parse_options
end

#sequence_filter ⇒ Hash

Sequence filter to apply.

Returns:

  • (Hash)

# File 'lib/bio/maf/index.rb', line 79

def sequence_filter
  @sequence_filter
end

Class Method Details

.file(maf, index = nil, options = {}) ⇒ Access

Provides access to a single MAF file. If this file is not indexed, it will be fully parsed to create a temporary in-memory index. For large MAF files or ones which will be used multiple times, this is inefficient, and an index file should be created with maf_index(1).

Parameters:

  • maf (String)

    path to MAF file

  • index (String) (defaults to: nil)

    Kyoto Cabinet index file

  • options (Hash) (defaults to: {})

    parser options

Returns:

  • (Access)

# File 'lib/bio/maf/index.rb', line 108

def self.file(maf, index=nil, options={})
  o = options.dup
  o[:maf] = maf
  o[:index] = index if index
  self.new(o)
end
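
The `options.dup` call above means the caller's hash is left untouched when `:maf` and `:index` are merged in. A minimal standalone sketch of that pattern follows. `build_access_options` is a hypothetical helper name used only for illustration (it is not part of bio-maf), and the `:parse_extended` key is just an example option.

```ruby
# Hypothetical helper mirroring how Access.file assembles its options hash.
# It is not part of bio-maf; it only demonstrates the dup-then-merge pattern.
def build_access_options(maf, index = nil, options = {})
  o = options.dup            # shallow copy, so the caller's hash is not mutated
  o[:maf] = maf              # the MAF path is always recorded
  o[:index] = index if index # the index path is recorded only when supplied
  o
end

caller_opts = { parse_extended: true }
merged = build_access_options('chr22.maf', nil, caller_opts)
puts merged.inspect
puts caller_opts.inspect
```

Because the copy is shallow, nested values inside the options hash would still be shared between the caller and the Access instance.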

.maf_dir(dir, options = {}) ⇒ Access

Provides access to a directory of indexed MAF files. Any files with .maf suffixes and accompanying .kct indexes in the given directory will be accessible.

Parameters:

  • dir (String)

    directory to scan

  • options (Hash) (defaults to: {})

    parser options

Returns:

  • (Access)

# File 'lib/bio/maf/index.rb', line 92

def self.maf_dir(dir, options={})
  o = options.dup
  o[:dir] = dir
  self.new(o)
end

Instance Method Details

#chrom_index(chrom) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/bio/maf/index.rb', line 264

def chrom_index(chrom)
  unless @indices.has_key? chrom
    raise "No index available for chromosome #{chrom}!"
  end
  index = @indices[chrom]
  if index.is_a? KyotoIndex
    # temporary
    index
  else
    KyotoIndex.open(index)
  end
end

#close ⇒ Object

Close all open resources, in particular Kyoto Cabinet database handles.



# File 'lib/bio/maf/index.rb', line 117

def close
  @indices.values.each { |ki| ki.close }
end

#find(intervals) {|block| ... } ⇒ Array<Block>

Find all alignment blocks in the genomic regions given by a list of Bio::GenomicInterval objects, and parse them with a parser configured from parse_options.

Parameters:

  • intervals (Enumerable<Bio::GenomicInterval>)

    genomic intervals to search

Yields:

  • (block)

    each Block matched, in turn

Returns:

  • (Array<Block>)

    each matching Block, if no block given

# File 'lib/bio/maf/index.rb', line 131

def find(intervals, &blk)
  if block_given?
    by_chrom = intervals.group_by { |i| i.chrom }
    by_chrom.keys.each do |chrom|
      unless @indices.has_key? chrom
        raise "No index available for chromosome #{chrom}!"
      end
    end
    by_chrom.each do |chrom, c_intervals|
      with_index(chrom) do |index|
        with_parser(chrom) do |parser|
          index.find(c_intervals, parser, block_filter, &blk)
        end
      end
    end
  else
    acc = []
    self.find(intervals) { |block| acc << block }
    acc
  end
end
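
Two properties of the listing above are easy to miss: every requested chromosome is validated before any parsing starts, and results come back grouped by chromosome rather than in the order the intervals were passed. The sketch below reproduces only that control flow. `Interval` is a stand-in Struct for Bio::GenomicInterval, and `find_sketch` is a hypothetical name that yields the intervals themselves instead of querying a real index.

```ruby
# Stand-in for Bio::GenomicInterval; only #chrom matters here.
Interval = Struct.new(:chrom, :chrom_start, :chrom_end)

# Hypothetical stand-in for Access#find. It yields the intervals themselves,
# where the real method would query an index and yield alignment blocks.
def find_sketch(intervals, indexed_chroms, &blk)
  if block_given?
    by_chrom = intervals.group_by(&:chrom)
    # Fail fast: check every chromosome before doing any work.
    missing = by_chrom.keys.reject { |c| indexed_chroms.include?(c) }
    raise "No index available for chromosome #{missing.first}!" unless missing.empty?
    by_chrom.each { |_chrom, group| group.each(&blk) }
  else
    # Without a block, re-enter with a collecting block and return an Array.
    acc = []
    find_sketch(intervals, indexed_chroms) { |i| acc << i }
    acc
  end
end

intervals = [
  Interval.new('chr1', 0, 10),
  Interval.new('chr2', 5, 9),
  Interval.new('chr1', 20, 30)
]
puts find_sketch(intervals, %w[chr1 chr2]).map(&:chrom).inspect
```

Callers that need results in input order would have to sort them afterwards or call the method once per interval.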

#find_index_file(maf) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/bio/maf/index.rb', line 231

def find_index_file(maf)
  dir = File.dirname(maf)
  base = File.basename(maf)
  noext = base.gsub(/\.maf.*/, '')
  idx = [base, noext].collect { |n| "#{dir}/#{n}.kct" }.find { |path| File.exist? path }
end
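
The lookup above tries two candidate names in the MAF file's own directory: the full file name plus `.kct`, then the name with everything from `.maf` onward stripped plus `.kct`. The sketch below isolates just the candidate computation, without touching the filesystem. `index_candidates` is a hypothetical helper name, and `/data/chr22.maf.gz` is a made-up path.

```ruby
# Hypothetical helper: returns the index paths that find_index_file would
# probe, in order, without checking whether they exist.
def index_candidates(maf)
  dir = File.dirname(maf)
  base = File.basename(maf)
  noext = base.gsub(/\.maf.*/, '') # drop ".maf" and anything after it
  [base, noext].map { |n| "#{dir}/#{n}.kct" }
end

puts index_candidates('/data/chr22.maf.gz').inspect
```

The first existing candidate wins, so an index named after the full file name takes precedence over one named after the stripped base name.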

#register_index(index, maf) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/bio/maf/index.rb', line 239

def register_index(index, maf)
  unless index.maf_file == File.basename(maf)
    raise "Index #{index.path} was created for #{index.maf_file}, not #{File.basename(maf)}!"
  end
  if index.path.to_s.start_with? '%'
    @indices[index.ref_seq] = index
  else
    @indices[index.ref_seq] = index.path.to_s
  end
  @maf_by_chrom[index.ref_seq] = maf
end
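
The registration step above stores two different kinds of value: a temporary index (path starting with `%`) is kept as a live object, while an on-disk index is kept only as its path string and reopened on demand by chrom_index. The sketch below shows that branching with plain Ruby objects. `IndexInfo` and `register_sketch` are hypothetical names, and the file names are invented.

```ruby
# Stand-in for KyotoIndex, exposing only the attributes register_index reads.
IndexInfo = Struct.new(:path, :maf_file, :ref_seq)

# Hypothetical stand-in for register_index, writing into an explicit registry.
def register_sketch(registry, index, maf)
  unless index.maf_file == File.basename(maf)
    raise "Index #{index.path} was created for #{index.maf_file}, not #{File.basename(maf)}!"
  end
  temporary = index.path.to_s.start_with?('%')
  # Temporary indexes cannot be reopened, so keep the object itself;
  # on-disk indexes are remembered by path only.
  registry[index.ref_seq] = temporary ? index : index.path.to_s
  registry
end

registry = {}
register_sketch(registry, IndexInfo.new('/data/chr22.kct', 'chr22.maf', 'hg19.chr22'), '/data/chr22.maf')
register_sketch(registry, IndexInfo.new('%', 'chrX.maf', 'hg19.chrX'), '/data/chrX.maf')
puts registry.transform_values(&:class).inspect
```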

#scan_dir(dir) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/bio/maf/index.rb', line 252

def scan_dir(dir)
  Dir.glob("#{dir}/*.kct").each do |index_f|
    index = KyotoIndex.open(index_f)
    maf = "#{dir}/#{index.maf_file}"
    if File.exist? maf
      register_index(index, maf)
    end
    index.close
  end
end

#slice(interval) {|block| ... } ⇒ Array<Block>

Find and parse all alignment blocks in the genomic region given by a Bio::GenomicInterval, and truncate them to just the region intersecting that interval.

Parameters:

  • interval (Bio::GenomicInterval)

    genomic interval to slice

Yields:

  • (block)

    each Block matched, in turn

Returns:

  • (Array<Block>)

    each matching Block, if no block given

# File 'lib/bio/maf/index.rb', line 182

def slice(interval, &blk)
  with_index(interval.chrom) do |index|
    with_parser(interval.chrom) do |parser|
      s = index.slice(interval, parser, block_filter, &blk)
      block_given? ? s : s.to_a
    end
  end
end

#tile(interval) {|tiler| ... } ⇒ Object

Find and parse all alignment blocks in the genomic region given by a Bio::GenomicInterval, and combine them to synthesize a single alignment covering that interval exactly.

Parameters:

  • interval (Bio::GenomicInterval)

    genomic interval to tile

Yields:

  • (tiler)

    a Tiler ready to operate on the given interval



# File 'lib/bio/maf/index.rb', line 161

def tile(interval)
  with_index(interval.chrom) do |index|
    with_parser(interval.chrom) do |parser|
      tiler = Tiler.new
      tiler.index = index
      tiler.parser = parser
      tiler.interval = interval
      yield tiler
    end
  end
end

#with_index(chrom) ⇒ Object



# File 'lib/bio/maf/index.rb', line 277

def with_index(chrom)
  index = chrom_index(chrom)
  LOG.debug { "Selected index #{index} for sequence #{chrom}." }
  begin
    yield index
  ensure
    index.close unless index.path.to_s.start_with? '%'
  end
end
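
The `ensure` clause above guarantees that an on-disk index handle is closed even if the block raises, while a temporary index (path starting with `%`) is deliberately left open because it cannot be reopened. A minimal sketch of the same pattern follows. `FakeIndex` and `with_handle` are hypothetical names standing in for KyotoIndex and with_index.

```ruby
# Stand-in for KyotoIndex that records whether #close was called.
class FakeIndex
  attr_reader :path, :closed

  def initialize(path)
    @path = path
    @closed = false
  end

  def close
    @closed = true
  end
end

# Hypothetical stand-in for with_index: yields the handle, then closes it
# unless it is a temporary ('%') index. The method returns the block's value.
def with_handle(index)
  yield index
ensure
  index.close unless index.path.to_s.start_with?('%')
end

on_disk = FakeIndex.new('chr22.kct')
begin
  with_handle(on_disk) { raise 'boom' }
rescue RuntimeError
  # The error still propagates to the caller; the handle is closed regardless.
end
puts on_disk.closed
```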

#with_parser(chrom) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/bio/maf/index.rb', line 288

def with_parser(chrom)
  LOG.debug { "Creating parser with options #{@parse_options.inspect}" }
  parser = Parser.new(@maf_by_chrom[chrom], @parse_options)
  parser.sequence_filter = self.sequence_filter
  begin
    yield parser
  ensure
    parser.close
  end
end