Class: MiGA::Dataset
- Includes: Result
- Defined in: lib/miga/dataset/base.rb, lib/miga/dataset.rb
Overview
Dataset representation in MiGA.
Defined Under Namespace
Constant Summary
Constants included from MiGA
CITATION, VERSION, VERSION_DATE, VERSION_NAME
Instance Attribute Summary
- #metadata ⇒ Object (readonly)
  MiGA::Metadata with information about the dataset.
- #name ⇒ Object (readonly)
  Datasets are uniquely identified by name in a project.
- #project ⇒ Object (readonly)
  MiGA::Project that contains the dataset.
Class Method Summary
- .exist?(project, name) ⇒ Boolean
  Does the project already have a dataset with that name?
- .INFO_FIELDS ⇒ Object
  Standard fields of metadata for datasets.
- .KNOWN_TYPES ⇒ Object
- .PREPROCESSING_TASKS ⇒ Object
- .RESULT_DIRS ⇒ Object
Instance Method Summary
- #closest_relatives(how_many = 1, ref_project = false) ⇒ Object
  Returns an Array of how_many duples (Arrays) sorted by AAI, each holding the name(s) of the reference dataset and the AAI.
- #ignore_task?(task) ⇒ Boolean
  Should task be ignored for this dataset?
- #info ⇒ Object
  Get standard metadata values for the dataset as Array.
- #initialize(project, name, is_ref = true, metadata = {}) ⇒ Dataset (constructor)
  Create a MiGA::Dataset object in a project (a MiGA::Project) with a uniquely identifying name.
- #is_multi? ⇒ Boolean
  Is this dataset known to be multi-organism?
- #is_nonmulti? ⇒ Boolean
  Is this dataset known to be single-organism?
- #is_query? ⇒ Boolean
  Is this dataset a query (non-reference)?
- #is_ref? ⇒ Boolean
  Is this dataset a reference?
- #remove! ⇒ Object
  Delete the dataset with all its contents (including results) and return nil.
- #save ⇒ Object
  Save any changes you’ve made in the dataset.
- #type ⇒ Object
  Get the type of dataset as Symbol.
Methods included from Result
#add_result, #cleanup_distances!, #done_preprocessing?, #each_result, #first_preprocessing, #get_result, #next_preprocessing, #profile_advance, #result, #results
Methods inherited from MiGA
CITATION, DEBUG, DEBUG_OFF, DEBUG_ON, DEBUG_TRACE_OFF, DEBUG_TRACE_ON, FULL_VERSION, LONG_VERSION, VERSION, VERSION_DATE, clean_fasta_file, initialized?, #result_files_exist?, root_path, script_path, seqs_length, tabulate
Constructor Details
#initialize(project, name, is_ref = true, metadata = {}) ⇒ Dataset
Create a MiGA::Dataset object in a project (a MiGA::Project) with a uniquely identifying name. is_ref indicates whether the dataset is to be treated as a reference (true, default) or a query (false). Pass any additional metadata as a Hash.
# File 'lib/miga/dataset.rb', line 50
def initialize(project, name, is_ref=true, metadata={})
  raise "Invalid name '#{name}', please use only alphanumerics and " +
    "underscores." unless name.miga_name?
  @project = project
  @name = name
  metadata[:ref] = is_ref
  @metadata = MiGA::Metadata.new(
    File.expand_path("metadata/#{name}.json", project.path), metadata)
end
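The constructor rejects a name before it stores any state. The following is a minimal sketch of that guard, not MiGA's own code: `valid_dataset_name?` and `check_dataset_name!` are hypothetical helpers, and the regular expression is an assumption inferred from the error message ("only alphanumerics and underscores"), since the definition of `String#miga_name?` is not shown on this page.

```ruby
# Hypothetical stand-in for String#miga_name?, which MiGA defines elsewhere.
# Assumption: a valid name is one or more ASCII letters, digits, or underscores.
def valid_dataset_name?(name)
  name.match?(/\A[A-Za-z0-9_]+\z/)
end

# Mirrors the guard at the top of the constructor: raise before anything is set.
def check_dataset_name!(name)
  unless valid_dataset_name?(name)
    raise "Invalid name '#{name}', please use only alphanumerics and underscores."
  end
  name
end

puts check_dataset_name!("Ecoli_K12")

begin
  check_dataset_name!("E. coli K12") # spaces and dots are rejected
rescue RuntimeError => e
  puts e.message
end
```

Validating up front matters here because the name is interpolated into the metadata file path (metadata/NAME.json under the project path).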
Instance Attribute Details
#metadata ⇒ Object (readonly)
MiGA::Metadata with information about the dataset.
# File 'lib/miga/dataset.rb', line 43
def metadata
  @metadata
end
#name ⇒ Object (readonly)
Datasets are uniquely identified by name in a project.
# File 'lib/miga/dataset.rb', line 39
def name
  @name
end
#project ⇒ Object (readonly)
MiGA::Project that contains the dataset.
# File 'lib/miga/dataset.rb', line 35
def project
  @project
end
Class Method Details
.exist?(project, name) ⇒ Boolean
Does the project already have a dataset with that name?
# File 'lib/miga/dataset.rb', line 19
def exist?(project, name)
  project.dataset_names.include? name
end
.INFO_FIELDS ⇒ Object
Standard fields of metadata for datasets.
# File 'lib/miga/dataset.rb', line 25
def INFO_FIELDS
  %w(name created updated type ref user description comments)
end
.KNOWN_TYPES ⇒ Object
# File 'lib/miga/dataset/base.rb', line 9
def KNOWN_TYPES ; @@KNOWN_TYPES ; end
.PREPROCESSING_TASKS ⇒ Object
# File 'lib/miga/dataset/base.rb', line 10
def PREPROCESSING_TASKS ; @@PREPROCESSING_TASKS ; end
.RESULT_DIRS ⇒ Object
# File 'lib/miga/dataset/base.rb', line 8
def RESULT_DIRS ; @@RESULT_DIRS ; end
Instance Method Details
#closest_relatives(how_many = 1, ref_project = false) ⇒ Object
Returns an Array of how_many duples (Arrays) sorted by AAI:
- 0: A String with the name(s) of the reference dataset.
- 1: A Float with the AAI.
This function is currently supported only for query datasets when ref_project is false (default), and only for reference datasets when ref_project is true. It returns nil if this analysis is not supported.
# File 'lib/miga/dataset.rb', line 130
def closest_relatives(how_many=1, ref_project=false)
  return nil if (is_ref? != ref_project) or is_multi?
  r = result(ref_project ? :taxonomy : :distances)
  return nil if r.nil?
  db = SQLite3::Database.new(r.file_path :aai_db)
  db.execute("SELECT seq2, aai FROM aai WHERE seq2 != ? " +
    "GROUP BY seq2 ORDER BY aai DESC LIMIT ?", [name, how_many])
end
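To make the return shape concrete, here is a rough in-memory analogue of the query, under stated assumptions: `AAI_ROWS` and `closest_pairs` are hypothetical, the rows are invented sample data, and each seq2 is assumed to appear only once (so the GROUP BY step has nothing to collapse). It is a sketch of the filter, sort, and limit steps, not of the SQLite access itself.

```ruby
# Hypothetical rows standing in for the `aai` table, as [seq2, aai] pairs.
AAI_ROWS = [
  ["ref_A", 71.2],
  ["ref_B", 93.5],
  ["query_1", 100.0],
  ["ref_C", 88.0]
]

# Drop the dataset's own row, sort by AAI (highest first), keep how_many pairs.
def closest_pairs(rows, own_name, how_many = 1)
  rows.reject { |seq2, _| seq2 == own_name }
      .sort_by { |_, aai| -aai }
      .first(how_many)
end

closest_pairs(AAI_ROWS, "query_1", 2).each do |pair|
  # pair[0] is the reference name, pair[1] is the AAI
  puts "#{pair[0]}\t#{pair[1]}"
end
```

Callers of the real method should check for nil before iterating, since nil signals an unsupported analysis rather than an empty result.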
#ignore_task?(task) ⇒ Boolean
Should task be ignored for this dataset?
# File 'lib/miga/dataset.rb', line 114
def ignore_task?(task)
  return !metadata["run_#{task}"] unless metadata["run_#{task}"].nil?
  return true if task==:taxonomy and project.metadata[:ref_project].nil?
  pattern = [true, false]
  ( [@@_EXCLUDE_NOREF_TASKS_H[task], is_ref?     ]==pattern or
    [@@_ONLY_MULTI_TASKS_H[task],    is_multi?   ]==pattern or
    [@@_ONLY_NONMULTI_TASKS_H[task], is_nonmulti?]==pattern )
end
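The final expression in ignore_task? compares three [flag, state] pairs against [true, false]: a task is ignored when it is flagged in one of the lookup tables and the dataset lacks the matching property. The sketch below isolates only that comparison. The three tables and their task names are hypothetical, because the contents of the @@_*_TASKS_H hashes are not shown on this page, and the metadata override and taxonomy check that precede it in the real method are left out.

```ruby
# Hypothetical flag tables; the real ones are class variables in MiGA.
EXCLUDE_NOREF = { ref_only_step: true }
ONLY_MULTI    = { multi_only_step: true }
ONLY_NONMULTI = { single_only_step: true }

# A pair matches only when the task is flagged (true) and the dataset
# does not have the corresponding property (false).
def ignore?(task, is_ref:, is_multi:, is_nonmulti:)
  pattern = [true, false]
  [EXCLUDE_NOREF[task], is_ref] == pattern ||
    [ONLY_MULTI[task], is_multi] == pattern ||
    [ONLY_NONMULTI[task], is_nonmulti] == pattern
end

# A query (non-reference) dataset skips the reference-only step.
puts ignore?(:ref_only_step, is_ref: false, is_multi: false, is_nonmulti: true)
# A task absent from every table looks up as nil and never matches the pattern.
puts ignore?(:unflagged_step, is_ref: false, is_multi: false, is_nonmulti: false)
```

Because an unflagged task yields nil rather than true, the comparison needs no explicit key check.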
#info ⇒ Object
Get standard metadata values for the dataset as Array.
# File 'lib/miga/dataset.rb', line 82
def info
  MiGA::Dataset.INFO_FIELDS.map do |k|
    (k=="name") ? self.name : metadata[k.to_sym]
  end
end
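As an illustration of the Array that info produces, the sketch below applies the same mapping to a plain Hash. `info_row` is a hypothetical helper and the sample values are invented. A Hash is used in place of MiGA::Metadata on the assumption that only Symbol-keyed lookup is needed.

```ruby
# Field order as returned by MiGA::Dataset.INFO_FIELDS.
INFO_FIELDS = %w(name created updated type ref user description comments)

# "name" comes from the dataset itself; every other field is looked up
# in the metadata by Symbol key, giving nil when the field is unset.
def info_row(name, metadata)
  INFO_FIELDS.map { |k| k == "name" ? name : metadata[k.to_sym] }
end

row = info_row("Ecoli_K12", { type: :genome, ref: true, user: "alice" })

# Pair each value with its field name for display.
INFO_FIELDS.zip(row).each { |field, value| puts "#{field}: #{value.inspect}" }
```

The values come back in INFO_FIELDS order, so the two arrays can be zipped to label a table row.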
#is_multi? ⇒ Boolean
Is this dataset known to be multi-organism?
# File 'lib/miga/dataset.rb', line 98
def is_multi?
  return false if metadata[:type].nil? or @@KNOWN_TYPES[type].nil?
  @@KNOWN_TYPES[type][:multi]
end
#is_nonmulti? ⇒ Boolean
Is this dataset known to be single-organism?
# File 'lib/miga/dataset.rb', line 106
def is_nonmulti?
  return false if metadata[:type].nil? or @@KNOWN_TYPES[type].nil?
  !@@KNOWN_TYPES[type][:multi]
end
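Note that is_multi? and is_nonmulti? are not complements: both return false when the type is missing or unknown, so "known to be" should be read literally. A small sketch of this three-way outcome follows. The `KNOWN_TYPES` table here is hypothetical (the real @@KNOWN_TYPES entries are not listed on this page), and `multi?` and `nonmulti?` are stand-ins that take the type as an argument.

```ruby
# Hypothetical type table, assuming each entry carries a :multi flag.
KNOWN_TYPES = {
  genome:     { multi: false },
  metagenome: { multi: true }
}

def multi?(type)
  return false if type.nil? || KNOWN_TYPES[type].nil?
  KNOWN_TYPES[type][:multi]
end

def nonmulti?(type)
  return false if type.nil? || KNOWN_TYPES[type].nil?
  !KNOWN_TYPES[type][:multi]
end

[:genome, :metagenome, :unlisted_type, nil].each do |t|
  puts "#{t.inspect}: multi=#{multi?(t)} nonmulti=#{nonmulti?(t)}"
end
```

Code that branches on these predicates therefore needs a third path for datasets whose type is undetermined.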
#is_query? ⇒ Boolean
Is this dataset a query (non-reference)?
# File 'lib/miga/dataset.rb', line 94
def is_query? ; !metadata[:ref] ; end
#is_ref? ⇒ Boolean
Is this dataset a reference?
# File 'lib/miga/dataset.rb', line 90
def is_ref? ; !!metadata[:ref] ; end
#remove! ⇒ Object
Delete the dataset with all its contents (including results) and return nil.
# File 'lib/miga/dataset.rb', line 75
def remove!
  self.results.each{ |r| r.remove! }
  self.metadata.remove!
end
#save ⇒ Object
Save any changes you’ve made in the dataset.
# File 'lib/miga/dataset.rb', line 62
def save
  self.metadata[:type] = :metagenome if !metadata[:tax].nil? and
    !metadata[:tax][:ns].nil? and metadata[:tax][:ns]=="COMMUNITY"
  self.metadata.save
end
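Before writing, save applies one rule: if the taxonomy namespace is "COMMUNITY", the type is forced to :metagenome. The sketch below reproduces only that rule on a plain Hash. `apply_community_rule!` is a hypothetical helper, the Hash stands in for MiGA::Metadata, and the "OTHER_NS" namespace is an arbitrary placeholder. The write to disk is not modeled.

```ruby
# Overwrite :type only when a taxonomy with namespace "COMMUNITY" is present.
def apply_community_rule!(metadata)
  tax = metadata[:tax]
  if !tax.nil? && !tax[:ns].nil? && tax[:ns] == "COMMUNITY"
    metadata[:type] = :metagenome
  end
  metadata
end

community = apply_community_rule!({ type: :genome, tax: { ns: "COMMUNITY" } })
untouched = apply_community_rule!({ type: :genome, tax: { ns: "OTHER_NS" } })

puts community[:type].inspect
puts untouched[:type].inspect
```

One consequence worth knowing is that an explicitly set type is overwritten on save whenever the namespace is "COMMUNITY".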
#type ⇒ Object
Get the type of dataset as Symbol.
# File 'lib/miga/dataset.rb', line 70
def type ; metadata[:type] ; end