Module: EStem

Included in:
String
Defined in:
lib/estem.rb

Overview

Spanish Stemming

Description

This gem reduces Spanish words to their stems. It uses an algorithm based on Martin Porter’s specification for Spanish.

For more information, visit: snowball.tartarus.org/algorithms/spanish/stemmer.html

License

This code is provided under the terms of the MIT License.

Authors

* Manuel A. G

Instance Method Details

#es_stem ⇒ Object

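Stems a Spanish word and returns the stem as a new string. The letter case of the remaining characters is preserved, as the mixed-case examples show.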

"albergues".es_stem      # ==> "alberg"
"habitaciones".es_stem   # ==> "habit"
"ALbeRGues".es_stem      # ==> "ALbeRG"
"HaBiTaCiOnEs".es_stem   # ==> "HaBiT"
"Hacinamiento".es_stem   # ==> "Hacin"

Call sequence: str.es_stem => "new_str"
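The call sequence indicates that a new string is returned, and the module is mixed into String. The following is a minimal sketch of that pattern, not the gem's code: `DemoStem` and `demo_stem` are hypothetical names, and stripping a trailing "s" merely stands in for real stemming.

```ruby
# Hypothetical module for illustration only; it is not part of the gem.
module DemoStem
  # Returns a copy of the receiver without a trailing "s".
  # The receiver itself is left untouched.
  def demo_stem
    str = dup          # work on a copy, as es_stem does with self.dup
    str.chomp!('s')    # chomp! mutates only the copy (nil if nothing to remove)
    str
  end
end

# Mixing the module into String makes the method available on every string.
class String
  include DemoStem
end

word = 'casas'
stem = word.demo_stem
puts stem   # the stemmed copy
puts word   # the original string, unchanged
```

Working on a duplicate keeps the method free of side effects, so callers can stem the same string repeatedly without losing the original.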



# File 'lib/estem.rb', line 43

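def es_stem
  str = dup
  # Single-character words skip stemming; only remove_accent is applied.
  return remove_accent(str) if str.length == 1

  # Step 0: keep the result only if the step returned one.
  str = step0(str) || str

  # Steps 2a and 2b run only when step 1 returns nil or false.
  # Step 1's return value is used as a flag here and is not assigned to str.
  unless step1(str)
    # Step 2b is tried only when step 2a returns nil or false.
    str = step2a(str) || step2b(str) || str
  end

  # Step 3: replace str unless the step returned nil.
  tmp = step3(str)
  str = tmp unless tmp.nil?
  remove_accent(str)
end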
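
The control flow above (try one step, fall back to the next only if nothing changed, then strip accents) can be sketched in a self-contained form. This is a toy under stated assumptions, not the gem's implementation: `StemFlowSketch`, `strip_suffix`, `strip_acute`, and the short suffix lists are all hypothetical, each step is assumed to return a new string or nil, and the R1/R2/RV region checks of the Snowball specification are omitted entirely.

```ruby
# Toy illustration of a step pipeline; names and suffix lists are hypothetical.
module StemFlowSketch
  ACCENTED = 'áéíóúÁÉÍÓÚ'
  PLAIN    = 'aeiouAEIOU'

  module_function

  # Assumed stand-in for remove_accent: replaces acute-accented vowels.
  def strip_acute(word)
    word.tr(ACCENTED, PLAIN)
  end

  # Removes the first matching suffix, comparing case-insensitively so the
  # remaining characters keep their original case. Returns nil if none match.
  def strip_suffix(word, suffixes)
    suffixes.each do |suffix|
      if word.downcase.end_with?(suffix)
        return word[0, word.length - suffix.length]
      end
    end
    nil
  end

  def stem(word)
    str = word.dup
    return strip_acute(str) if str.length == 1

    # Stand-in for "step 1".
    changed = strip_suffix(str, %w[aciones amiento])
    # Stand-in for "step 2": tried only when step 1 found nothing.
    changed ||= strip_suffix(str, %w[ues es])

    strip_acute(changed || str)
  end
end

%w[albergues habitaciones ALbeRGues HaBiTaCiOnEs Hacinamiento].each do |w|
  puts "#{w} -> #{StemFlowSketch.stem(w)}"
end
```

Returning nil for "no change" lets each later step be chained with `||`, which is the same idea the method above expresses with its nested conditionals.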