Class: Unipept::CSVFormatter

Inherits: Formatter (Object → Formatter → Unipept::CSVFormatter)

Defined in: lib/formatters.rb

Instance Method Summary

#convert(data, _first) ⇒ String

Converts the given input data to the CSV format.
#footer ⇒ Object
#get_keys(data, fasta_mapper = nil) ⇒ Object
#header(data, fasta_mapper = nil) ⇒ String

Returns the header row for the given data and fasta_mapper.
#type ⇒ String

The type of the current formatter: csv.

Methods inherited from Formatter

available, default, #format, formatters, #group_by_first_key, hidden?, #integrate_fasta_headers, new_for_format, register

Instance Method Details

#convert(data, _first) ⇒ `String`

Converts the given input data to the CSV format.

# File 'lib/formatters.rb', line 237

def convert(data, _first)
  keys = get_keys(data)

  CSV.generate do |csv|
    data.each do |o|
      row = {}
      o.each do |k, v|
        if %w[ec go ipr].include? k
          if v && !v.empty?
            v.first.each_key do |key|
              row[key == 'protein_count' ? "#{k}_protein_count" : key] = (v.map { |el| el[key] }).join(' ').strip
            end
          else
            row[k] = nil # no annotations of this type: the expanded columns stay nil in the lookup below
          end
        else
          row[k] = (v == '' ? nil : v)
        end
      end
      csv << keys.map { |k| row[k] }
    end
  end
end
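
The flattening step in #convert can be hard to picture from the source alone. The following is a minimal standalone sketch, not the library's code: `flatten_annotation` is a hypothetical helper and the sample record is made up. It assumes each annotation value is a list of hashes that share the same keys, and shows how such a list collapses into one space-separated cell per sub-key.

```ruby
require 'csv'

# Hypothetical sample record; the key names mirror those used in #convert,
# the values are invented for illustration.
record = {
  'peptide' => 'AALTER',
  'total_protein_count' => 3,
  'ec' => [
    { 'ec_number' => '2.7.11.1', 'protein_count' => 2 },
    { 'ec_number' => '3.6.4.12', 'protein_count' => 1 }
  ]
}

# Collapse a list of annotation hashes into one cell per sub-key.
# 'protein_count' is prefixed with the annotation type; other keys are kept.
def flatten_annotation(type, entries)
  return {} if entries.nil? || entries.empty?

  entries.first.each_key.to_h do |key|
    column = key == 'protein_count' ? "#{type}_protein_count" : key
    [column, entries.map { |el| el[key] }.join(' ').strip]
  end
end

# Replace the nested 'ec' entry by its flattened columns.
row = record.reject { |k, _| k == 'ec' }.merge(flatten_annotation('ec', record['ec']))
line = CSV.generate { |csv| csv << row.values }
puts line
```

Each annotation type therefore contributes one column per sub-key, and values from multiple annotations of the same type share a cell, separated by spaces.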



#footer ⇒ `Object`

# File 'lib/formatters.rb', line 226

def footer
  ''
end

#get_keys(data, fasta_mapper = nil) ⇒ `Object`

# File 'lib/formatters.rb', line 172

def get_keys(data, fasta_mapper = nil)
  # This global variable is necessary because we need to know how many items should be
  # nil in the convert function.
  $keys_length = 0 # rubocop:disable Style/GlobalVars
  # This hash keeps track of a row that is certainly filled in for each type of annotation
  non_empty_items = { 'ec' => nil, 'go' => nil, 'ipr' => nil }

  # First we look, for each of ec numbers, go terms and ipr codes, for a row in which the annotation is filled in.
  data.each do |row|
    non_empty_items.each_key do |annotation_type|
      non_empty_items[annotation_type] = row if row[annotation_type] && !row[annotation_type].empty?
    end
  end

  keys = fasta_mapper ? ['fasta_header'] : []
  keys += (data.first.keys - %w[ec go ipr])
  processed_keys = keys

  non_empty_items.each do |annotation_type, non_empty_item|
    next unless non_empty_item

    keys += (non_empty_item.keys - processed_keys)
    processed_keys += non_empty_item.keys

    idx = keys.index(annotation_type)
    keys.delete_at(idx)
    keys.insert(idx, *non_empty_item[annotation_type].first.keys.map { |el| %w[ec_number go_term ipr_code].include?(el) ? el : "#{annotation_type}_#{el}" })
    $keys_length = *non_empty_item[annotation_type].first.keys.length # rubocop:disable Style/GlobalVars
  end

  keys
end
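
The key expansion at the end of #get_keys replaces an annotation key such as 'ec' with one column name per sub-key. The sketch below isolates that step under stated assumptions: `expand_keys` is a hypothetical helper, and the key lists are illustrative rather than taken from real API output.

```ruby
# Hypothetical helper: build column names for the sub-keys of one annotation type.
# Identifier keys keep their name; every other sub-key gets the type as prefix.
def expand_keys(annotation_type, sub_keys)
  identifiers = %w[ec_number go_term ipr_code]
  sub_keys.map { |el| identifiers.include?(el) ? el : "#{annotation_type}_#{el}" }
end

# Illustrative key list in which 'ec' still stands for the nested annotation.
keys = %w[peptide total_protein_count ec]

# Swap the 'ec' placeholder for its expanded columns, keeping its position.
idx = keys.index('ec')
keys.delete_at(idx)
keys.insert(idx, *expand_keys('ec', %w[ec_number protein_count]))

p keys
```

Inserting at the original index keeps the expanded columns where the annotation key used to be, so the column order follows the order of the keys in the data.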

#header(data, fasta_mapper = nil) ⇒ `String`

Returns the header row for the given data and fasta_mapper. This row contains the keys of the first element of the data, with the ec, go and ipr entries expanded by #get_keys, preceded by ‘fasta_header’ if a fasta_mapper is given.

The fasta_mapper is an optional mapping between the input data and the corresponding fasta headers. It is represented as a list of tuples in which the first element is the fasta header and the second element is the input data.

# File 'lib/formatters.rb', line 218

def header(data, fasta_mapper = nil)
  keys = get_keys(data, fasta_mapper)

  CSV.generate do |csv|
    csv << keys.map(&:to_s) if keys.length.positive?
  end
end
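
As a small illustration of the header row, the sketch below reproduces only the CSV generation step with a hard-coded key list. `header_row` is a hypothetical stand-in for #header, and the key names are assumptions; in the class itself the keys come from #get_keys.

```ruby
require 'csv'

# Hypothetical stand-in for #header: writes one row of stringified keys,
# or nothing at all when there are no keys.
def header_row(keys)
  CSV.generate do |csv|
    csv << keys.map(&:to_s) unless keys.empty?
  end
end

with_fasta = header_row(['fasta_header', 'peptide', :taxon_id])
without_keys = header_row([])

puts with_fasta          # fasta_header,peptide,taxon_id
p without_keys           # ""
```

Because no row is written for an empty key list, the result is an empty string rather than a blank line.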

#type ⇒ `String`

The type of the current formatter: csv.

# File 'lib/formatters.rb', line 168

def type
  'csv'
end
