Class: ParallelParser

Inherits:
Object
Defined in:
lib/biodiversity/parser.rb

Overview

Public: Parser which runs in parallel.

Examples

  parser = ParallelParser.new(4)
  parser.parse(['Betula L.', 'Pardosa moesta'])

Constructor Details

#initialize(processes_num = nil) ⇒ ParallelParser

Public: Initialize ParallelParser.

processes_num - an Integer setting the number of processes (default: nil).
                A positive value is capped at one less than the number of CPUs.

If the number of processes is not set, it is determined automatically
from the number of CPUs.


# File 'lib/biodiversity/parser.rb', line 50

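def initialize(processes_num = nil)
  require 'parallel'
  cpu_num # memoize the CPU count once the parallel gem is loaded
  if processes_num.to_i > 0
    # Cap an explicit request at one less than the CPU count.
    @processes_num = [processes_num, cpu_num - 1].min
  else
    # Default: leave two CPUs free when there are more than three, else use one process.
    @processes_num = cpu_num > 3 ? cpu_num - 2 : 1
  end
end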
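
The selection rule above can be tried without the parallel gem. The sketch below is only an illustration: pick_processes_num is a hypothetical helper, not part of ParallelParser, and it takes the CPU count as an explicit argument instead of calling cpu_num.

```ruby
# Hypothetical helper (not part of ParallelParser) that mirrors the
# process-count selection in #initialize. The CPU count is passed in
# explicitly, so no gem is needed to run it.
def pick_processes_num(processes_num, cpu_num)
  if processes_num.to_i > 0
    # An explicit request is capped at one less than the CPU count.
    [processes_num, cpu_num - 1].min
  else
    # No request: leave two CPUs free on larger machines, else use one process.
    cpu_num > 3 ? cpu_num - 2 : 1
  end
end

pick_processes_num(4, 8)   # => 4
pick_processes_num(16, 8)  # => 7
pick_processes_num(nil, 8) # => 6
pick_processes_num(nil, 2) # => 1
```

Note that with a single CPU an explicit request evaluates to 0 under this rule, because the cap is cpu_num - 1.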

Instance Method Details

#cpu_num ⇒ Object

Public: Returns the number of cores/CPUs.

Returns the Integer number of cores/CPUs.



# File 'lib/biodiversity/parser.rb', line 86

def cpu_num
  @cpu_num ||= Parallel.processor_count
end
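
The `||=` memoization used here means the CPU count is looked up only once per object. A minimal sketch of that pattern follows. CpuCounter and its lookups counter are hypothetical names made up for this illustration, and Etc.nprocessors from the Ruby standard library stands in for Parallel.processor_count so the snippet runs without the gem.

```ruby
require 'etc'

# Hypothetical stand-in class; it is not part of the biodiversity gem.
class CpuCounter
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  # Looks up the CPU count on the first call and reuses it afterwards.
  def cpu_num
    @cpu_num ||= begin
      @lookups += 1
      Etc.nprocessors # assumption: used here in place of Parallel.processor_count
    end
  end
end

counter = CpuCounter.new
counter.cpu_num
counter.cpu_num
counter.lookups # => 1
```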

#parse(names_list) ⇒ Object

Public: Parses an array of scientific names using several processes in parallel.

Scientific names are deduplicated in the process, so every string is parsed only once.

names_list - an Array of scientific names; each element should be a String.

Examples

  parser = ParallelParser.new(4)
  parser.parse(['Homo sapiens L.', 'Quercus quercus'])

Returns a Hash with scientific names as keys and parsing results as values.



# File 'lib/biodiversity/parser.rb', line 76

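def parse(names_list)
  # Parse each distinct name in a worker process; collect [name, result] pairs.
  parsed = Parallel.map(names_list.uniq, in_processes: @processes_num) do |n|
    [n, parse_process(n)]
  end
  # Fold the pairs into a Hash keyed by name.
  parsed.inject({}) { |res, x| res[x[0]] = x[1]; res }
end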
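
The deduplicate-then-build-a-Hash shape of #parse can be seen in a sequential sketch. Everything here is an assumption made for illustration: fake_parse is a hypothetical stand-in for the real per-name parser, parse_sequentially is a made-up name, and a plain map replaces Parallel.map so the snippet runs without the gem.

```ruby
# Hypothetical stand-in for the real per-name parser.
def fake_parse(name)
  { words: name.split.size }
end

# Sequential sketch of #parse: each distinct name is processed once,
# then the [name, result] pairs are turned into a Hash.
def parse_sequentially(names_list)
  names_list.uniq.map { |n| [n, fake_parse(n)] }.to_h
end

result = parse_sequentially(['Betula L.', 'Pardosa moesta', 'Betula L.'])
result.keys        # => ["Betula L.", "Pardosa moesta"]
result['Betula L.'] # => {:words=>2}
```

Because the input is passed through uniq first, a repeated name costs one parse and appears as a single key in the result.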