Module: MiGA::Dataset::Base

Included in:
Result
Defined in:
lib/miga/dataset/base.rb

Constant Summary

@@RESULT_DIRS =

Directories containing the results from dataset-specific tasks

{
  # Preprocessing
  raw_reads: '01.raw_reads',
  trimmed_reads: '02.trimmed_reads',
  read_quality: '03.read_quality',
  trimmed_fasta: '04.trimmed_fasta',
  assembly: '05.assembly',
  cds: '06.cds',
  # Annotation
  essential_genes: '07.annotation/01.function/01.essential',
  mytaxa: '07.annotation/02.taxonomy/01.mytaxa',
  mytaxa_scan: '07.annotation/03.qa/02.mytaxa_scan',
  # Distances (for single-species datasets)
  taxonomy: '09.distances/05.taxonomy',
  distances: '09.distances',
  # Post-QC
  ssu: '07.annotation/01.function/02.ssu',
  stats: '90.stats'
}
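As an illustration of how this mapping could be used, the sketch below resolves a task name to a directory under a project data root. The helper `result_dir` and the root path are hypothetical and not part of MiGA; the hash is a copied subset of the values listed above so that the example runs on its own.

```ruby
# Subset of the directory mapping listed above, copied here so the
# sketch runs without MiGA installed.
RESULT_DIRS = {
  raw_reads: '01.raw_reads',
  assembly: '05.assembly',
  essential_genes: '07.annotation/01.function/01.essential',
  stats: '90.stats'
}.freeze

# Hypothetical helper (not a MiGA method): build the directory path for
# a task under an assumed project data root.
def result_dir(data_root, task)
  subdir = RESULT_DIRS.fetch(task) do
    raise ArgumentError, "Unknown task: #{task}"
  end
  File.join(data_root, subdir)
end

puts result_dir('/tmp/project/data', :essential_genes)
# prints /tmp/project/data/07.annotation/01.function/01.essential
```

Using `fetch` with a block makes an unknown task fail loudly instead of silently producing a path that ends in the data root.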
@@KNOWN_TYPES =

Supported dataset types

{
  genome: {
    description: 'The genome from an isolate',
    multi: false, markers: true,
    project_types: %i[mixed genomes clade]
  },
  scgenome: {
    description: 'A Single-cell Amplified Genome (SAG)',
    multi: false, markers: true,
    project_types: %i[mixed genomes clade]
  },
  popgenome: {
    description: 'A Metagenome-Assembled Genome (MAG)',
    multi: false, markers: true,
    project_types: %i[mixed genomes clade]
  },
  metagenome: {
    description: 'A metagenome (excluding viromes)',
    multi: true, markers: true,
    project_types: %i[mixed metagenomes]
  },
  virome: {
    description: 'A viral metagenome',
    multi: true,
    markers: true, # <- Markers are not expected, but can be useful to detect contamination
    project_types: %i[mixed metagenomes]
  },
  plasmid: {
    description: 'An individual plasmid',
    multi: false, markers: false,
    project_types: %i[mixed plasmids]
  }
}
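The type table can be queried like any nested Hash. The following sketch, which assumes a local copy of three of the entries above and a hypothetical helper `types_for_project`, lists the dataset types allowed in a given project type and reads the `multi` and `markers` flags.

```ruby
# Local copy of three entries from the type table above (descriptions
# omitted), so the sketch runs without MiGA installed.
KNOWN_TYPES = {
  genome: { multi: false, markers: true, project_types: %i[mixed genomes clade] },
  metagenome: { multi: true, markers: true, project_types: %i[mixed metagenomes] },
  plasmid: { multi: false, markers: false, project_types: %i[mixed plasmids] }
}.freeze

# Hypothetical helper (not a MiGA method): dataset types that may be
# part of a project of the given type.
def types_for_project(project_type)
  KNOWN_TYPES.select { |_, v| v[:project_types].include?(project_type) }.keys
end

p types_for_project(:mixed)        # [:genome, :metagenome, :plasmid]
p types_for_project(:metagenomes)  # [:metagenome]
p KNOWN_TYPES[:plasmid][:markers]  # false
```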
@@PREPROCESSING_TASKS =

Array of tasks (Symbols) to be executed before project-wide tasks

%i[
  raw_reads trimmed_reads read_quality trimmed_fasta
  assembly cds essential_genes mytaxa mytaxa_scan
  taxonomy distances ssu stats
]
@@EXCLUDE_NOREF_TASKS =

Tasks to be excluded from query datasets

%i[mytaxa_scan taxonomy]
@@_EXCLUDE_NOREF_TASKS_H =
Hash[@@EXCLUDE_NOREF_TASKS.map { |i| [i, true] }]
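The `_H` constants appear to be lookup tables derived from the corresponding task lists. A minimal sketch of that pattern, using a local copy of the list rather than the module's class variables:

```ruby
# Local copy of the task list above.
EXCLUDE_NOREF_TASKS = %i[mytaxa_scan taxonomy].freeze

# Map each task to true so that membership can be tested with a
# single Hash lookup instead of scanning the Array.
lookup = Hash[EXCLUDE_NOREF_TASKS.map { |i| [i, true] }]

p lookup[:taxonomy]  # true
p lookup[:assembly]  # nil (not an excluded task)
```

A missing key returns `nil`, which is falsy, so `lookup[task]` can be used directly in a condition.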
@@EXCLUDE_NOMARKER_TASKS =

Tasks to be excluded from datasets without markers

%i[essential_genes ssu]
@@_EXCLUDE_NOMARKER_TASKS_H =
@@ONLY_NONMULTI_TASKS =

Tasks to be executed only in datasets that are single-organism. These tasks are ignored for multi-organism datasets or for unknown types

%i[mytaxa_scan taxonomy distances]
@@_ONLY_NONMULTI_TASKS_H =
@@ONLY_MULTI_TASKS =

Tasks to be executed only in datasets that are multi-organism. These tasks are ignored for single-organism datasets or for unknown types

%i[mytaxa]
@@_ONLY_MULTI_TASKS_H =
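Taken together, the exclusion lists above describe which preprocessing tasks apply to a given dataset. The sketch below is one possible reading of those descriptions, not MiGA's actual implementation: the helper `applicable_tasks` and its keyword arguments are hypothetical, and `multi: nil` stands for an unknown dataset type.

```ruby
# Local copies of the task lists above.
PREPROCESSING_TASKS = %i[
  raw_reads trimmed_reads read_quality trimmed_fasta
  assembly cds essential_genes mytaxa mytaxa_scan
  taxonomy distances ssu stats
].freeze
EXCLUDE_NOREF_TASKS = %i[mytaxa_scan taxonomy].freeze
EXCLUDE_NOMARKER_TASKS = %i[essential_genes ssu].freeze
ONLY_NONMULTI_TASKS = %i[mytaxa_scan taxonomy distances].freeze
ONLY_MULTI_TASKS = %i[mytaxa].freeze

# Hypothetical helper (not a MiGA method).
# ref:     true for reference datasets, false for query datasets
# markers: whether the dataset type is expected to have marker genes
# multi:   true (multi-organism), false (single-organism), nil (unknown)
def applicable_tasks(ref:, markers:, multi:)
  PREPROCESSING_TASKS.reject do |task|
    (!ref && EXCLUDE_NOREF_TASKS.include?(task)) ||
      (!markers && EXCLUDE_NOMARKER_TASKS.include?(task)) ||
      (multi != false && ONLY_NONMULTI_TASKS.include?(task)) ||
      (multi != true && ONLY_MULTI_TASKS.include?(task))
  end
end

# A reference metagenome keeps mytaxa but skips the single-organism tasks.
p applicable_tasks(ref: true, markers: true, multi: true)
```

Under this reading, a dataset of unknown type (`multi: nil`) skips both the single-organism and the multi-organism tasks, which matches the "ignored for unknown types" wording in both descriptions.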
@@OPTIONS =

Options supported by datasets

{
  db_project: {
    desc: 'Project to use as database', type: String
  },
  dist_req: {
    desc: 'Run distances against these datasets', type: Array, default: []
  }
}
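Each option entry carries a description, an expected type, and optionally a default. As a rough sketch of how such a table might be consumed, the hypothetical helper `option_value` below (not a MiGA method) falls back to the declared default and checks the type of user-supplied values.

```ruby
# Local copy of the options table above.
OPTIONS = {
  db_project: { desc: 'Project to use as database', type: String },
  dist_req: {
    desc: 'Run distances against these datasets', type: Array, default: []
  }
}.freeze

# Hypothetical helper: return the user-supplied value for an option, or
# the declared default (nil when no default is declared).
def option_value(user_values, key)
  spec = OPTIONS.fetch(key) do
    raise ArgumentError, "Unsupported option: #{key}"
  end
  value = user_values.fetch(key) { spec[:default] }
  unless value.nil? || value.is_a?(spec[:type])
    raise TypeError, "#{key} must be of type #{spec[:type]}"
  end
  value
end

p option_value({}, :dist_req)                         # []
p option_value({ db_project: 'my_db' }, :db_project)  # "my_db"
```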