Class: Abner

Inherits:
NER
  • Object
Defined in:
lib/rbbt/ner/abner.rb

Overview

Offers a Ruby interface to ABNER, a named entity recognition package written in Java.

Methods inherited from NER

#entities

Constructor Details

#initialize(modelfile = nil) ⇒ Abner

If modelfile is given, a custom-trained model is loaded from that file; otherwise the default BioCreative model is used.



# File 'lib/rbbt/ner/abner.rb', line 21

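def initialize(modelfile = nil)
  # Make sure the Java classes are imported before they are used.
  Abner.init
  if modelfile.nil?
    # No model file given: use the default BioCreative model.
    @tagger = @@Tagger.new(@@Tagger.BIOCREATIVE)
  else
    # Load a custom-trained model from the given file.
    @tagger = @@Tagger.new(@@JFile.new(modelfile))
  end
end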

Class Method Details

.init ⇒ Object



# File 'lib/rbbt/ner/abner.rb', line 13

def self.init
  @@JFile   ||= Rjb::import('java.io.File')
  @@Tagger  ||= Rjb::import('abner.Tagger')
  @@Trainer ||= Rjb::import('abner.Trainer')
end
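
The `||=` assignments make `init` safe to call repeatedly (the constructor calls it on every instantiation): each Java class is imported only on the first call. Below is a minimal sketch of that pattern. `LazyImports` and `fake_import` are hypothetical stand-ins for illustration, used instead of `Rjb::import` so the example runs without a JVM.

```ruby
# Hypothetical illustration of the memoized-import pattern; not part of Rbbt.
class LazyImports
  @@import_calls = 0

  # Stand-in for an expensive call such as Rjb::import (assumption).
  def self.fake_import(name)
    @@import_calls += 1
    "proxy for #{name}"
  end

  # Each class variable is assigned only while it is still unset,
  # so repeated calls to init do not import anything again.
  def self.init
    @@file_class   ||= fake_import('java.io.File')
    @@tagger_class ||= fake_import('abner.Tagger')
  end

  def self.import_calls
    @@import_calls
  end
end

LazyImports.init
LazyImports.init
puts LazyImports.import_calls
```

Calling `init` twice still performs only one import per class, which is why the constructor can invoke it unconditionally.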

Instance Method Details

#match(text) ⇒ Object

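Finds all the mentions in a chunk of text. It returns every mention found, regardless of entity type, for consistency with the other NER packages in Rbbt. Each returned mention is set up as a NamedEntity carrying its offset in the text (nil if the mention cannot be located) and its type.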



# File 'lib/rbbt/ner/abner.rb', line 33

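def match(text)
  return [] if text.nil? || text.empty?

  # Re-encode as UTF-8, dropping bytes that cannot be converted.
  text = text.encode('utf-8', 'binary', :invalid => :replace, :undef => :replace, :replace => '')
  res = @tagger.getEntities(text)
  types = res[1]
  strings = res[0]

  # Number of characters of the original text already consumed.
  global_offset = 0
  strings.zip(types).collect do |mention, type|
    mention = mention.to_s
    offset = text.index(mention)
    if offset.nil?
      # Mention not found in the remaining text: leave its offset unset.
      NamedEntity.setup(mention, nil, type.to_s)
    else
      NamedEntity.setup(mention, offset + global_offset, type.to_s)
      # Keep only the text after this mention so that repeated
      # mentions resolve to later occurrences.
      text = text[offset + mention.length..-1]
      global_offset += offset + mention.length
    end

    mention
  end
end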
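
The offset bookkeeping in `match` can be tried in isolation. The sketch below reproduces only that part in plain Ruby, without the Java tagger or `NamedEntity`. The helper name `locate_mentions` and the sample sentence are assumptions made for illustration and are not part of Rbbt.

```ruby
# Hypothetical helper, not part of Rbbt: locates each mention in order and
# returns [mention, absolute_offset] pairs (offset is nil when not found).
def locate_mentions(text, mentions)
  remaining = text
  consumed = 0 # characters of the original text already scanned past

  mentions.map do |mention|
    offset = remaining.index(mention)
    if offset.nil?
      [mention, nil]
    else
      absolute = consumed + offset
      # Continue searching after this mention so that a repeated string
      # is matched to its next occurrence rather than the first one.
      advance = offset + mention.length
      remaining = remaining[advance..-1]
      consumed += advance
      [mention, absolute]
    end
  end
end

text = 'BRCA1 interacts with TP53, and BRCA1 binds RAD51.'
pairs = locate_mentions(text, ['BRCA1', 'TP53', 'BRCA1', 'RAD51'])
pairs.each { |mention, offset| puts "#{mention} at #{offset.inspect}" }
```

Because the search always resumes after the previous hit, the two `BRCA1` mentions receive different offsets. The same design means a mention reported out of textual order may not be found, in which case its offset stays `nil`.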