Class: GentleBrute::WordAnalyzer

Inherits:
Object
Defined in:
lib/gentle_brute/word_analyzer.rb

Constructor Details

#initialize(cpa_threshold) ⇒ WordAnalyzer

Returns a new instance of WordAnalyzer.



# File 'lib/gentle_brute/word_analyzer.rb', line 3

def initialize(cpa_threshold)
  @suffixes = File.read(File.expand_path('../resources/suffixes', __FILE__)).split "\n"
  @vowels = ['a', 'e', 'i', 'o', 'u']
  @chars = ('a'..'z').to_a + ["'"]
  @punctuation = ["!", "?", ".", ","]
  @cpa_threshold = cpa_threshold
  @cpa_analyzer = CPAAnalyzer.new
end

Instance Method Details

#follows_proper_suffix_patterns?(word) ⇒ Boolean

Test whether or not a given word follows proper suffix usage patterns

Parameters:

  • word (String)

    the word to test

Returns:

  • (Boolean)

    whether or not the word passed the test



# File 'lib/gentle_brute/word_analyzer.rb', line 100

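def follows_proper_suffix_patterns? word
  @suffixes.each do | suffix |
    # Skip suffixes the word does not end with (no regex interpolation needed)
    next unless word.end_with? suffix
    # A word that consists of nothing but the suffix passes
    return true if word.length == suffix.length
    # Reject the word if the character just before the suffix repeats the suffix's first character
    index = (suffix.length * -1) - 1
    return false if word[index] == suffix[0]
  end
  true
end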
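
The suffix rule can be tried in isolation with a small standalone sketch. The helper name follows_suffix_rule? and the three-entry suffix list are assumptions made for illustration; the real list is read from resources/suffixes and is not shown here.

```ruby
# Hypothetical stand-in for the suffix list loaded from resources/suffixes.
SAMPLE_SUFFIXES = %w[ing ed ly].freeze

# Mirrors the rule in the listing: a word fails when a matching suffix is
# immediately preceded by that suffix's own first letter.
def follows_suffix_rule?(word, suffixes = SAMPLE_SUFFIXES)
  suffixes.each do |suffix|
    next unless word.end_with?(suffix)
    return true if word.length == suffix.length # the word is the suffix itself

    preceding = word[-suffix.length - 1]
    return false if preceding == suffix[0]
  end
  true
end

puts follows_suffix_rule?("walking") # => true  ("k" precedes "ing")
puts follows_suffix_rule?("skiing")  # => false ("i" precedes "ing")
puts follows_suffix_rule?("ing")     # => true  (the word equals the suffix)
```

As the "skiing" case shows, the rule is a heuristic and can reject real English words.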

#has_vowel?(word) ⇒ Boolean

Test whether or not a given word has at least one vowel

Parameters:

  • word (String)

    the word to test

Returns:

  • (Boolean)

    whether or not the word passed the test



# File 'lib/gentle_brute/word_analyzer.rb', line 79

def has_vowel? word
  @vowels.any? { |vowel| word.include? vowel }
end
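
A minimal standalone sketch of the same check, useful for seeing its edge cases. The constant VOWELS and the helper name contains_vowel? are assumptions for this example; the vowel list itself matches the one set in the constructor.

```ruby
# Same five vowels as the constructor's @vowels; "y" is deliberately absent.
VOWELS = %w[a e i o u].freeze

# True when the word contains at least one of the listed vowels.
def contains_vowel?(word)
  VOWELS.any? { |vowel| word.include?(vowel) }
end

puts contains_vowel?("tree")   # => true
puts contains_vowel?("rhythm") # => false ("y" is not treated as a vowel)
puts contains_vowel?("TREE")   # => false (the comparison is case-sensitive)
```

The check is case-sensitive on its own; is_valid_word? downcases its input before calling it.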

#is_valid_phrase?(phrase) ⇒ Boolean

Test whether or not a given phrase follows the rules of English-like phrases

Parameters:

  • phrase (String)

    the phrase to test for validity

Returns:

  • (Boolean)

    whether or not the phrase passed all the validation tests



# File 'lib/gentle_brute/word_analyzer.rb', line 46

def is_valid_phrase? phrase
  # A phrase must not start or end with a space
  return false if phrase =~ /^ /
  return false if phrase =~ / $/
  # A phrase must not mix hyphens and underscores
  return false if phrase.include? "-" and phrase.include? "_"
  # Runs of three or more spaces are not allowed
  return false if phrase.include? "   "

  # A double space separates sub-phrases; each one is validated independently
  phrases = phrase.split "  "
  phrases.each do | phrase_ |
    return false if phrase_.split(" ").length == 1 # phrases can't be just one word
    return false if not passes_phrase_pattern_test? phrase_
    phrase_.split(" ").each do | word |
      return false if not is_valid_word? word
    end
  end

  true
end
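
The structural part of this method (spacing, separators, and the split into sub-phrases) can be sketched on its own. In this sketch the pattern test is omitted and per-word validation is delegated to a caller-supplied block; both simplifications, and the helper name phrase_structure_ok?, are assumptions for illustration.

```ruby
# Structural checks only. Word validation is supplied by the caller's block,
# and the pattern test from the listing is intentionally left out.
def phrase_structure_ok?(phrase)
  return false if phrase.start_with?(" ") || phrase.end_with?(" ")
  return false if phrase.include?("-") && phrase.include?("_")
  return false if phrase.include?("   ") # three or more consecutive spaces

  # A double space separates sub-phrases; each needs at least two words.
  phrase.split("  ").all? do |sub_phrase|
    words = sub_phrase.split(" ")
    words.length > 1 && words.all? { |word| yield(word) }
  end
end

accept_all = ->(_word) { true }

puts phrase_structure_ok?("hello world", &accept_all)  # => true
puts phrase_structure_ok?("hello", &accept_all)        # => false (one word)
puts phrase_structure_ok?("a b  c d", &accept_all)     # => true  (two sub-phrases)
puts phrase_structure_ok?("a b  c", &accept_all)       # => false (second sub-phrase has one word)
```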

#is_valid_word?(word) ⇒ Boolean

Test whether or not a given word follows the rules of English-like words

Parameters:

  • word (String)

    the word to test for validity

Returns:

  • (Boolean)

    whether or not the word passed all the validation tests



# File 'lib/gentle_brute/word_analyzer.rb', line 15

def is_valid_word? word
  return false if word.empty? # Empty words are not valid words
  word = word.downcase # ignore mixed casing without mutating the caller's string
  return is_valid_phrase? word if word.include? " " # If it's a phrase, test it as a phrase
  length = word.length

  # The only valid single letter words are 'a' and 'i'
  return false if length == 1 and word != "i" and word != "a"

  # The word must contain at least one vowel
  return false if not has_vowel? word

  # Validate Character Position Analysis Scores
  return false if not passes_neighbor_tests? word

  # The word must conform to proper apostrophe usage rules
  return false if not uses_valid_apostrophes? word

  # Does the word follow proper suffix patterns?
  return false if not follows_proper_suffix_patterns? word

  # Does the word contain triple char patterns?
  return false if word.length > 5 and not passes_direct_patterns_test? word

  # The word is (probably) valid!
  true
end

#passes_direct_patterns_test?(word) ⇒ Boolean

Verify a given word does not have any illegal character patterns

Parameters:

  • word (String)

    the word to test

Returns:

  • (Boolean)

    whether or not the word passed the test



# File 'lib/gentle_brute/word_analyzer.rb', line 113

def passes_direct_patterns_test? word
  pattern_data = PatternFinder.patterns_in_strintg word
  return true if not pattern_data
  return false if pattern_data[3] > 2
  true
end

#passes_neighbor_tests?(word) ⇒ Boolean

Verify a given word has all the required CPA neighbor scores

Parameters:

  • word (String)

    the word to test

Returns:

  • (Boolean)

    whether or not the word passed the test



# File 'lib/gentle_brute/word_analyzer.rb', line 123

def passes_neighbor_tests? word
  length = word.length
  for i in 0...length
    char = word[i]
    char_r = word[i+1]
    char_rr = word[i+2]
    return false if char == char_r and char == char_rr
    return false if char == char_r and length == 2
    return false if not @chars.include? char
  
    if char_r
      begin
        if i == 0
          return false if @cpa_analyzer.get_starter_neighbor_score(char, char_r)[0] == 0
        elsif i == length-3
          return false if @cpa_analyzer.get_ender_neighbor_score(char, char_r)[0] == 0
        end
        score = @cpa_analyzer.get_neighbor_score(char, char_r)
        return false if score[0] < @cpa_threshold
      rescue
      end
    end
  
    if char_rr
      begin
        if i == 0
          return false if @cpa_analyzer.get_starter_neighbor_score(char, char_rr)[1] == 0
        elsif i == length-3
          return false if @cpa_analyzer.get_ender_neighbor_score(char, char_rr)[1] == 0
        end
        score = @cpa_analyzer.get_neighbor_score(char, char_rr)
        return false if score[1] < @cpa_threshold
      rescue
      end
    end
  end
  
  true
end
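
The character-level checks at the top of the loop do not depend on CPA scores and can be sketched separately. The helper name passes_character_checks? and the constant ALLOWED_CHARS are assumptions for this example; the score lookups against CPAAnalyzer are left out because its data is not shown here.

```ruby
# Same alphabet as the constructor's @chars: lowercase letters plus apostrophe.
ALLOWED_CHARS = (("a".."z").to_a + ["'"]).freeze

# Covers only the score-independent rules from the listing:
# allowed characters, no doubled two-letter words, no triple repeats.
def passes_character_checks?(word)
  chars = word.chars
  return false unless chars.all? { |c| ALLOWED_CHARS.include?(c) }
  return false if chars.length == 2 && chars[0] == chars[1]

  chars.each_cons(3).none? { |a, b, c| a == b && b == c }
end

puts passes_character_checks?("book") # => true
puts passes_character_checks?("oo")   # => false (two-letter word with a doubled character)
puts passes_character_checks?("brrr") # => false (same character three times in a row)
puts passes_character_checks?("b00k") # => false (digits are not in the alphabet)
```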

#passes_phrase_pattern_test?(phrase) ⇒ Boolean

Verify a given phrase does not have any illegal character patterns

Parameters:

  • phrase (String)

    the phrase to test

Returns:

  • (Boolean)


# File 'lib/gentle_brute/word_analyzer.rb', line 67

def passes_phrase_pattern_test? phrase
  pattern_data = PatternFinder.patterns_in_strintg phrase
  return true if pattern_data == nil
  return false if pattern_data[3] > 3
  true
end

#uses_valid_apostrophes?(word) ⇒ Boolean

Test whether or not a given word follows proper apostrophe usage rules

Parameters:

  • word (String)

    the word to test

Returns:

  • (Boolean)

    whether or not the word passed the test



# File 'lib/gentle_brute/word_analyzer.rb', line 86

def uses_valid_apostrophes? word
  if word.include? "'"
    length = word.length
    index = word.index "'"
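    # Exactly one apostrophe is allowed
    return false if length-1 != word.tr("'", "").length
    # The apostrophe must be the last or second-to-last character
    return false if index != length-1 and index != length-2
    # The word must end in s, t, m, or d, or have an s right before its final character
    return false if word[-1] != 's' and word[-2] != 's' and word[-1] != 't' and word[-1] != 'm' and word[-1] != 'd'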
  end
  true
end
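
The three apostrophe rules can be exercised with a standalone sketch that follows the same logic. The helper name apostrophe_ok? is an assumption for this example.

```ruby
# Follows the three rules from the listing: exactly one apostrophe, placed
# last or second to last, with an accepted ending around it.
def apostrophe_ok?(word)
  return true unless word.include?("'")
  return false unless word.count("'") == 1

  index = word.index("'")
  return false unless [word.length - 1, word.length - 2].include?(index)

  # Accepted: final character is s, t, m, or d, or an "s" sits just before it.
  %w[s t m d].include?(word[-1]) || word[-2] == "s"
end

puts apostrophe_ok?("don't")   # => true
puts apostrophe_ok?("dogs'")   # => true  ("s" right before the final apostrophe)
puts apostrophe_ok?("we'll")   # => false (apostrophe is third from the end)
puts apostrophe_ok?("o'clock") # => false (apostrophe is too far from the end)
```

Contractions such as "we'll" and "they're" are rejected by these rules because the apostrophe sits more than two characters from the end.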