Class: Spacy::Matcher

Inherits:

Object

Defined in: lib/ruby-spacy.rb

Overview

See also the spaCy Python API documentation for Matcher.

Instance Attribute Summary

#py_matcher ⇒ Object readonly
A Python Matcher instance accessible via PyCall.
#spacy_matcher_id ⇒ String readonly
An identifier string that can be used when referring to the Python object inside PyCall::exec or PyCall::eval.

Instance Method Summary

#add(text, pattern) ⇒ Object
Adds a label string and a text pattern.
#initialize(nlp_id) ⇒ Matcher constructor
Creates a Matcher instance.
#match(doc) ⇒ Array<Hash{:match_id => Integer, :start_index => Integer, :end_index => Integer}>
Executes the match against a doc.

Constructor Details

#initialize(nlp_id) ⇒ `Matcher`

Creates a Spacy::Matcher instance.

Parameters:

nlp_id (String) —
The id string of the nlp object, an instance of the Language class

# File 'lib/ruby-spacy.rb', line 446

def initialize(nlp_id)
  @spacy_matcher_id = "doc_#{nlp_id}_matcher"
  PyCall.exec("#{@spacy_matcher_id} = Matcher(#{nlp_id}.vocab)")
  @py_matcher = PyCall.eval(@spacy_matcher_id)
end
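
The constructor above only builds two strings before handing them to PyCall: a Python-side variable name and an assignment statement. The following is a minimal sketch of that string building alone, with no PyCall or spaCy involved. The helper name `matcher_python_names` and the sample id `"nlp_1234"` are hypothetical and are not part of ruby-spacy.

```ruby
# Sketch only: mirrors the string interpolation in #initialize without
# executing anything in Python. `matcher_python_names` is a hypothetical
# helper introduced for illustration.
def matcher_python_names(nlp_id)
  # Name of the Python variable that will hold the Matcher.
  matcher_id = "doc_#{nlp_id}_matcher"
  # Python statement that the constructor passes to PyCall.exec.
  statement = "#{matcher_id} = Matcher(#{nlp_id}.vocab)"
  { matcher_id: matcher_id, statement: statement }
end

names = matcher_python_names("nlp_1234") # "nlp_1234" is a made-up id
puts names[:matcher_id]
puts names[:statement]
```

Because the matcher's Python name is derived from `nlp_id`, two matchers created from the same nlp object would refer to the same Python variable under this scheme.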

Instance Attribute Details

#py_matcher ⇒ `Object` (readonly)

Returns a Python Matcher instance accessible via PyCall.

Returns:

(Object) —
a Python Matcher instance accessible via PyCall




# File 'lib/ruby-spacy.rb', line 442

def py_matcher
  @py_matcher
end

#spacy_matcher_id ⇒ `String` (readonly)

Returns an identifier string that can be used when referring to the Python object inside PyCall::exec or PyCall::eval.

Returns:

(String) —
an identifier string that can be used when referring to the Python object inside PyCall::exec or PyCall::eval




# File 'lib/ruby-spacy.rb', line 439

def spacy_matcher_id
  @spacy_matcher_id
end

Instance Method Details

#add(text, pattern) ⇒ `Object`

Adds a label string and a text pattern.

Parameters:

text (String) —
a label string given to the pattern
pattern (Array<Array<Hash>>) —
alternative sequences of text patterns




# File 'lib/ruby-spacy.rb', line 455

def add(text, pattern)
  @py_matcher.add(text, pattern)
end
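
To make the `Array<Array<Hash>>` shape of `pattern` concrete, here is a small sketch that builds two alternative token sequences in the style of spaCy's token patterns and checks their nesting. It is assumed, not confirmed by this page, that ruby-spacy accepts symbol keys such as `LOWER` and `IS_PUNCT` as written. The helper `pattern_shape_ok?` and the label `"HelloWorld"` are hypothetical.

```ruby
# One sequence: the token "hello", any punctuation token, the token "world".
with_punct = [
  { LOWER: "hello" },
  { IS_PUNCT: true },
  { LOWER: "world" }
]

# An alternative sequence without the punctuation token.
without_punct = [{ LOWER: "hello" }, { LOWER: "world" }]

# `pattern` is an array of alternative sequences: Array<Array<Hash>>.
pattern = [with_punct, without_punct]

# Hypothetical helper (not part of ruby-spacy): checks only the nesting,
# not whether the attribute names are valid for spaCy.
def pattern_shape_ok?(pattern)
  return false unless pattern.is_a?(Array)

  pattern.all? do |sequence|
    sequence.is_a?(Array) && sequence.all? { |token_spec| token_spec.is_a?(Hash) }
  end
end

puts pattern_shape_ok?(pattern)

# With ruby-spacy loaded and a Spacy::Matcher in `matcher`, the call would
# then presumably look like:
#   matcher.add("HelloWorld", pattern)
```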

#match(doc) ⇒ `Array<Hash{:match_id => Integer, :start_index => Integer, :end_index => Integer}>`

Executes the match against a doc.

Parameters:

doc (Doc) —
A Doc instance

Returns:

(Array<Hash{:match_id => Integer, :start_index => Integer, :end_index => Integer}>) —
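The id of the matched pattern, the starting token position, and the end token position. As the source below shows, end_index is the end value reported by the Python matcher minus 1, so it is inclusive.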

# File 'lib/ruby-spacy.rb', line 462

def match(doc)
  str_results = PyCall.eval("#{@spacy_matcher_id}(#{doc.spacy_doc_id})").to_s
  s = StringScanner.new(str_results[1..-2])
  results = []
  while s.scan_until(/(\d+), (\d+), (\d+)/)
    next unless s.matched
    triple = s.matched.split(", ")
    match_id = triple[0].to_i
    start_index = triple[1].to_i
    end_index = triple[2].to_i - 1
    results << {match_id: match_id, start_index: start_index, end_index: end_index}
  end
  results
end
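
The parsing step in `#match` can be tried out without spaCy. The sketch below applies the same StringScanner approach to a hand-written string that imitates the printed form of the Python matcher's result, then uses the inclusive `end_index` to slice a token array. The helper name `parse_match_string`, the sample string, and the token list are all made up for illustration. Unlike the source above, it reads the capture groups directly instead of splitting the matched text.

```ruby
require "strscan"

# Hypothetical helper: converts a string such as "[(101, 0, 3)]" into
# hashes, subtracting 1 from the end value so that it becomes inclusive.
def parse_match_string(str_results)
  scanner = StringScanner.new(str_results[1..-2]) # drop the outer brackets
  results = []
  while scanner.scan_until(/(\d+), (\d+), (\d+)/)
    results << {
      match_id: scanner[1].to_i,
      start_index: scanner[2].to_i,
      end_index: scanner[3].to_i - 1
    }
  end
  results
end

# Made-up matcher output and tokens, not produced by a real spaCy run.
sample = "[(101, 0, 3), (101, 4, 6)]"
tokens = %w[Hello , world ! hello world]

parse_match_string(sample).each do |m|
  # An inclusive range works because end_index was already decremented.
  puts tokens[m[:start_index]..m[:end_index]].join(" ")
end
```

Since `end_index` is inclusive, `tokens[start_index..end_index]` (two dots) selects the matched span; a three-dot range would drop the last token.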
