Class: Opener::LanguageIdentifier

Inherits:
Object
Defined in:
lib/opener/language_identifier.rb,
lib/opener/language_identifier/cli.rb,
lib/opener/language_identifier/server.rb,
lib/opener/language_identifier/version.rb,
lib/opener/language_identifier/detector.rb,
lib/opener/language_identifier/kaf_builder.rb,
lib/opener/language_identifier/backend/opennlp.rb,
lib/opener/language_identifier/backend/language_detection.rb,
lib/opener/language_identifier/backend/detect_language_com.rb

Overview

Language identifier class that can detect various languages such as Dutch, German and Swedish.

Defined Under Namespace

Modules: Backend

Classes: CLI, Detector, KafBuilder, Server

Constant Summary

DEFAULT_OPTIONS =

Hash containing the default options to use.

Returns:

  • (Hash)

{
  args:  [],
  kaf:   true,
}.freeze

VERSION =
'4.4.3'

Constructor Details

#initialize(options = {}) ⇒ LanguageIdentifier

Returns a new instance of LanguageIdentifier.

Parameters:

  • options (Hash) (defaults to: {})

Options Hash (options):

  • :args (Array)

    Arbitrary arguments to pass to the underlying kernel.

  • :kaf (TrueClass|FalseClass)

    When set to `true`, the results are returned as a KAF document.



# File 'lib/opener/language_identifier.rb', line 48

def initialize(options = {})
  @options  = DEFAULT_OPTIONS.merge(options)
  @detector = Detector.new ENV['BACKEND'], ENV['FALLBACK']
end
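
The merge in the constructor can be illustrated with a minimal standalone sketch. It copies DEFAULT_OPTIONS from the listing above so that it runs without the gem installed; the variable name `custom` is only for illustration. One point worth noting is that `freeze` is shallow, so the `:args` Array is shared between the defaults and the merged Hash.

```ruby
# Copy of the constant shown above, so the sketch is self-contained.
DEFAULT_OPTIONS = {
  args: [],
  kaf:  true,
}.freeze

# Hash#merge returns a new, unfrozen Hash; the defaults stay untouched.
custom = DEFAULT_OPTIONS.merge(kaf: false)

puts custom[:kaf]          # value supplied by the caller
puts DEFAULT_OPTIONS[:kaf] # default is unchanged
puts custom.frozen?

# The freeze is shallow: both Hashes refer to the same Array object.
puts custom[:args].equal?(DEFAULT_OPTIONS[:args])
```

Passing a fresh Array for `:args` avoids mutating the shared default by accident.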

Instance Attribute Details

#options ⇒ Hash (readonly)

Returns:

  • (Hash)


# File 'lib/opener/language_identifier.rb', line 26

class LanguageIdentifier
  attr_reader :options

  ##
  # Hash containing the default options to use.
  #
  # @return [Hash]
  #
  DEFAULT_OPTIONS = {
    args:  [],
    kaf:   true,
  }.freeze

  ##
  # @param [Hash] options
  #
  # @option options [Array] :args Arbitrary arguments to pass to the
  #  underlying kernel.
  #
  # @option options [TrueClass|FalseClass] :kaf When set to `true` the
  #  results will be displayed as KAF.
  #
  def initialize(options = {})
    @options  = DEFAULT_OPTIONS.merge(options)
    @detector = Detector.new ENV['BACKEND'], ENV['FALLBACK']
  end

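  ##
  # Detects the language of the input and returns either the detected
  # language or, when the `:kaf` option is enabled, a KAF document.
  #
  # @param [String] input The text of which to detect the language.
  # @param [Hash] params Optional parameters. When `:language` is given,
  #  detection is skipped and that value is used instead.
  # @return [String]
  #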
  def run input, params = {}
    lang   = params[:language] # already provided, skip detection
    lang ||= @detector.detect input
    lang = build_kaf input, lang if options[:kaf]
    lang
  end

  alias identify run

  protected

  ##
  # Builds a KAF document containing the input and the correct XML language
  # tag based on the output of the kernel.
  #
  # @param [String] input The input text.
  # @param [String] language The detected language
  # @return [String]
  #
  def build_kaf(input, language)
    builder = KafBuilder.new(input, language)
    builder.build

    return builder.to_s
  end
end

Instance Method Details

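#run(input, params = {}) ⇒ String

Also known as: identify

Detects the language of the input. Returns the language as a String, or a KAF document containing the input and the language when the :kaf option is enabled. If params[:language] is given, detection is skipped and that value is used.

Parameters:

  • input (String)

    The text of which to detect the language.

  • params (Hash) (defaults to: {})

    Optional parameters; :language supplies an already known language.

Returns:

  • (String)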


# File 'lib/opener/language_identifier.rb', line 60

def run input, params = {}
  lang   = params[:language] # already provided, skip detection
  lang ||= @detector.detect input
  lang = build_kaf input, lang if options[:kaf]
  lang
end
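
The control flow of #run can be sketched without the gem. Everything below is a hypothetical stand-in: `StubDetector` replaces the real Detector and always answers "nl", `SketchIdentifier` mirrors only the branching of #run, and the KAF step is replaced by a placeholder string rather than real KafBuilder output.

```ruby
# Hypothetical stand-in for the Detector: always answers "nl" and counts
# how often it is asked.
class StubDetector
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def detect(_input)
    @calls += 1
    'nl'
  end
end

# Mirrors the branching of #run shown above. Not the library class.
class SketchIdentifier
  def initialize(detector, kaf: true)
    @detector = detector
    @kaf      = kaf
  end

  def run(input, params = {})
    lang   = params[:language] # already provided, skip detection
    lang ||= @detector.detect(input)
    # Placeholder wrapper; the real class builds a KAF document here.
    lang   = "<placeholder lang=\"#{lang}\">#{input}</placeholder>" if @kaf
    lang
  end
end

detector = StubDetector.new
plain    = SketchIdentifier.new(detector, kaf: false)

detected = plain.run('Dit is een zin.')                 # detector is consulted
supplied = plain.run('Dit is een zin.', language: 'de') # detector is skipped
wrapped  = SketchIdentifier.new(detector, kaf: true).run('Hallo', language: 'nl')

puts detected
puts supplied
puts wrapped
puts detector.calls
```

In this sketch the detector is consulted only once, because the other two calls supply `:language` themselves.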