Class: Opener::LanguageIdentifier::Backend::LanguageDetection
- Inherits: Object
- Defined in: lib/opener/language_identifier/backend/language_detection.rb
Constant Summary
- DEFAULT_PROFILES_PATH =
Path to the directory containing the default profiles.
File.( '../../../../../core/target/classes/profiles', __FILE__ )
- DEFAULT_SHORT_PROFILES_PATH =
Path to the directory containing the default short profiles.
File.( '../../../../../core/target/classes/short_profiles', __FILE__ )
- PRIORITIES =
Prioritize OpeNER languages over the rest. Languages not covered by this list are automatically given a default priority.
{
  'en' => 1.0,
  'es' => 0.9,
  'it' => 0.9,
  'fr' => 0.9,
  'de' => 0.9,
  'nl' => 0.9,

  # These languages are disabled (for the time being) due to conflicting
  # with other (OpeNER) languages too often.
  'af' => 0.0, # conflicts with Dutch
}
- DEFAULT_PRIORITY =
The default priority for non-OpeNER languages.
0.5
- SHORT_THRESHOLD =
The number of characters after which the detector switches to the longer profile set.
15
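As a rough illustration of how SHORT_THRESHOLD might drive the choice between the two profile directories, here is a minimal sketch. The helper name `profiles_for` and the placeholder directory strings are hypothetical; the real selection happens in `determine_profiles`, which is not shown on this page. The sketch also assumes that inputs of exactly 15 characters still use the short profiles.

```ruby
# Minimal sketch, not the library's implementation.
module ProfileSelectionSketch
  # Value copied from the constant summary above.
  SHORT_THRESHOLD = 15

  # Placeholder names standing in for DEFAULT_PROFILES_PATH and
  # DEFAULT_SHORT_PROFILES_PATH (hypothetical values).
  LONG_PROFILES  = 'profiles'
  SHORT_PROFILES = 'short_profiles'

  # Hypothetical helper: choose a profile set from the input length.
  # Assumption: only inputs longer than SHORT_THRESHOLD use the long set.
  def self.profiles_for(input)
    input.length > SHORT_THRESHOLD ? LONG_PROFILES : SHORT_PROFILES
  end
end

puts ProfileSelectionSketch.profiles_for('Hola')
# prints "short_profiles"
puts ProfileSelectionSketch.profiles_for('This sentence is longer than fifteen characters.')
# prints "profiles"
```

Keeping the boundary in one named constant means the cut-off can be tuned without touching the selection logic.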
Instance Method Summary
- #detect(input) ⇒ String
- #initialize ⇒ LanguageDetection (constructor)
  A new instance of LanguageDetection.
- #new_detector(input) ⇒ Object
Constructor Details
#initialize ⇒ LanguageDetection
Returns a new instance of LanguageDetection.
# File 'lib/opener/language_identifier/backend/language_detection.rb', line 62
def initialize
  @factory = com.cybozu.labs.langdetect.DetectorFactory.new
end
Instance Method Details
#detect(input) ⇒ String
# File 'lib/opener/language_identifier/backend/language_detection.rb', line 81
def detect(input)
  detector = new_detector(input)
  detector.detect

# The core Java code raises an exception when it can't detect a language.
# Since this isn't actually something fatal we'll capture this and return
# "unknown" instead.
rescue com.cybozu.labs.langdetect.LangDetectException
  return 'unknown'
end
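The rescue-and-fall-back pattern used by #detect can be tried without JRuby or the Java library. The sketch below swaps in stand-ins: `FakeLangDetectException`, `FakeDetector`, and `detect_or_unknown` are all hypothetical names, and the stand-in detector simply reports "en" for any non-blank text.

```ruby
# Stand-in for com.cybozu.labs.langdetect.LangDetectException (hypothetical).
class FakeLangDetectException < StandardError; end

# Stand-in detector (hypothetical): reports "en" for any non-blank text and
# raises when there is nothing to analyse.
class FakeDetector
  def initialize(text)
    @text = text
  end

  def detect
    raise FakeLangDetectException, 'no features in text' if @text.strip.empty?

    'en'
  end
end

# Mirrors the shape of #detect: a failed detection is not treated as fatal,
# so the exception is translated into the string "unknown".
def detect_or_unknown(input)
  FakeDetector.new(input).detect
rescue FakeLangDetectException
  'unknown'
end

puts detect_or_unknown('Hello world') # prints "en"
puts detect_or_unknown('   ')         # prints "unknown"
```

Callers therefore always receive a String and can compare against 'unknown' instead of handling a Java exception themselves.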
#new_detector(input) ⇒ Object
# File 'lib/opener/language_identifier/backend/language_detection.rb', line 66
def new_detector(input)
  @factory.load_profile(determine_profiles(input))
  @factory.set_seed(1)

  priorities = build_priorities(input, @factory.langlist)
  detector = com.cybozu.labs.langdetect.Detector.new(@factory)

  detector.set_prior_map(priorities)
  detector.append(input.downcase)
  detector
end
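The `build_priorities` helper called above is not shown on this page. One plausible reading of the PRIORITIES and DEFAULT_PRIORITY descriptions is that every language known to the factory receives either its listed priority or the default. The sketch below illustrates that reading only: `PriorMapSketch.build` is a hypothetical name, it ignores the `input` argument that the real helper receives, and it returns a plain Ruby Hash rather than whatever the Java detector expects.

```ruby
# Minimal sketch under the assumptions stated above, not the library's code.
module PriorMapSketch
  # Values copied from the constant summary.
  PRIORITIES = {
    'en' => 1.0,
    'es' => 0.9,
    'it' => 0.9,
    'fr' => 0.9,
    'de' => 0.9,
    'nl' => 0.9,
    'af' => 0.0 # disabled: conflicts with Dutch
  }.freeze

  DEFAULT_PRIORITY = 0.5

  # Hypothetical helper: one entry per language code in langlist, using the
  # listed priority when present and DEFAULT_PRIORITY otherwise.
  def self.build(langlist)
    langlist.each_with_object({}) do |lang, map|
      map[lang] = PRIORITIES.fetch(lang, DEFAULT_PRIORITY)
    end
  end
end

PriorMapSketch.build(%w[en nl af ja]).each do |lang, priority|
  puts "#{lang}: #{priority}"
end
```

With this reading, a priority of 0.0 effectively removes Afrikaans from consideration, while an unlisted language such as Japanese stays eligible at 0.5.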