Class: String

Inherits:
Object
Defined in:
lib/core_ext/string.rb

Instance Method Details

#blank_to_nil ⇒ String?

Convert blank strings to nil.

Examples:

"foobar".blank_to_nil   # => "foobar"
" ".blank_to_nil        # => nil
"".blank_to_nil         # => nil
nil.blank_to_nil        # => nil

Returns:

  • (String, nil)

    converted string


# File 'lib/core_ext/string.rb', line 12

def blank_to_nil
  self if present?
end
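The `present?` call above comes from ActiveSupport. As a rough illustration of the same idea without that dependency, the following sketch uses a hypothetical top-level helper named `blank_to_nil` (not part of the documented API) to normalize optional input values:

```ruby
# Plain-Ruby sketch; `blank_to_nil` as a stand-alone helper is an assumption
# made for illustration, not the String extension documented above.
def blank_to_nil(value)
  return nil if value.nil?
  # A string counts as blank when it is empty or whitespace-only.
  value.match?(/\A[[:space:]]*\z/) ? nil : value
end

# Typical use: turn blank form fields into nil before storing them.
params = { name: "Ada", nickname: "   ", email: "" }
normalized = params.transform_values { |value| blank_to_nil(value) }
# => { name: "Ada", nickname: nil, email: nil }
```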

#cleanup ⇒ String

Fix messy oddities such as the use of two apostrophes instead of a quote

Examples:

"the ''Terror'' was a fine ship".cleanup   # => "the \"Terror\" was a fine ship"

Returns:

  • (String)


# File 'lib/core_ext/string.rb', line 22

def cleanup
  gsub(/[#{AIXM::MIN}]{2}|[#{AIXM::SEC}]/, '"').   # unify quotes
    gsub(/[#{AIXM::MIN}]/, "'").   # unify apostrophes
    gsub(/"[[:blank:]]*(.*?)[[:blank:]]*"/m, '"\1"').   # remove whitespace within quotes
    split(/\r?\n/).map { _1.strip.blank_to_nil }.compact.join("\n")   # remove blank lines
end

#compact ⇒ String

Note:

While similar to String#squish from ActiveSupport, newlines \n are preserved and not collapsed into one space.

Strip and collapse unnecessary whitespace

Examples:

"  foo\n\nbar \r".compact   # => "foo\nbar"

Returns:

  • (String)

    compacted string


# File 'lib/core_ext/string.rb', line 38

def compact
  split("\n").map { _1.squish.blank_to_nil }.compact.join("\n")
end
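To make the difference to `squish` concrete, here is a minimal sketch in plain Ruby. The helpers `squish` and `compact_lines` are hypothetical stand-ins written for this example (the real `squish` is provided by ActiveSupport):

```ruby
# Hypothetical stand-in for ActiveSupport's String#squish.
def squish(string)
  string.gsub(/[[:space:]]+/, " ").strip
end

# Hypothetical stand-in for the compact method documented above:
# squish every line on its own, then drop lines that end up empty.
def compact_lines(string)
  string.split("\n").map { |line| squish(line) }.reject(&:empty?).join("\n")
end

text = "  foo\n\nbar \r"
squished  = squish(text)          # => "foo bar"   (newlines collapsed)
compacted = compact_lines(text)   # => "foo\nbar"  (line break preserved)
```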

#correlate(other, synonyms = []) ⇒ Integer

Calculate the correlation of two strings by counting mutual words

Both strings are normalized as follows:

  • remove accents, umlauts etc

  • remove everything but members of the \w class

  • downcase
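The normalization steps above can be sketched in plain Ruby as follows. The helper name `normalize_for_correlation` is hypothetical, and the sketch follows the documented order of steps rather than the exact call chain of the implementation:

```ruby
# Hypothetical helper illustrating the three documented normalization steps.
def normalize_for_correlation(string)
  string.
    unicode_normalize(:nfd).   # decompose e.g. "è" into "e" plus a combining accent
    gsub(/[^\w\s]/, "").       # drop combining marks and punctuation, keep \w and whitespace
    downcase
end

normalized = normalize_for_correlation("Cr\u00e8me Br\u00fbl\u00e9e!")
# => "creme brulee"
```

This works because `\w` only matches ASCII word characters in Ruby, so the combining accents left behind by the NFD decomposition are removed together with the punctuation.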

The normalized strings are split into words. Only words fulfilling either of the following conditions are taken into consideration:

  • words present in and translated by the synonyms map

  • words of at least 5 characters length

  • words consisting of exactly one letter followed by any number of digits (an optional whitespace between the two is ignored, e.g. “D 25” is the same as “D25”)
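The word filter described above might be illustrated like this. The two regular expressions are the ones used in the implementation shown in this section, while the constant and variable names are made up for the example:

```ruby
# Words are kept if they have 5+ word characters, contain a letter/digit
# pair such as "n3", or contain an uppercase letter (translated synonyms).
SIGNIFICANT = /\w{5,}|\w\d+|[[:upper:]]/

# Join a single letter and the digits following it: "n 3" becomes "n3".
glued = "truck en route on n 3 sud".gsub(/\b(\w)\s?(\d+)\b/, '\1\2')

significant = glued.split(/\W+/).select { |word| word.match?(SIGNIFICANT) }
# => ["truck", "route", "n3"]
```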

The synonyms map is an array where terms in even positions map to their synonym in the following (odd) position:

SYNONYMS = ['term1', 'synonym1', 'term2', 'synonym2']
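As a sketch of how such a flat array is looked up, consider the following. The helper `translate` and the sample synonym pairs are assumptions made for illustration; the lookup logic mirrors the `map` step of the implementation shown in this section:

```ruby
# Sample map (hypothetical values): terms at even indexes, synonyms at odd ones.
SYNONYMS = ['sud', 'south', 'nord', 'north'].freeze

# Hypothetical helper mirroring the lookup done for every word.
def translate(word, synonyms)
  index = synonyms.index(word)
  return word if index.nil?   # not in the map: keep the word as is
  # A term (even index) is replaced by its synonym, a synonym (odd index)
  # is kept. Either way the result is upcased, which marks it as translated.
  (index.odd? ? word : synonyms[index + 1]).upcase
end

from_term    = translate('sud', SYNONYMS)     # => "SOUTH"
from_synonym = translate('south', SYNONYMS)   # => "SOUTH"
unknown      = translate('truck', SYNONYMS)   # => "truck"
```

Because both the term and its synonym end up as the same upcased word, they count as a mutual word no matter which of the two appears in either string.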

Examples:

subject = "Truck en route on N 3 sud"
subject.correlate("my car is on D25")          # => 0
subject.correlate("my truck is on D25")        # => 1
subject.correlate("my truck is on N3")         # => 2
subject.correlate("south", ['sud', 'south'])   # => 1

Parameters:

  • other (String)

    string to compare with

  • synonyms (Array<String>) (defaults to: [])

    array of synonym pairs

Returns:

  • (Integer)

    0 for unrelated strings and positive integers for related strings with higher numbers indicating tighter correlation


# File 'lib/core_ext/string.rb', line 73

def correlate(other, synonyms=[])
  self_words, other_words = [self, other].map do |string|
    string.
      unicode_normalize(:nfd).
      downcase.gsub(/[-\u2013]/, ' ').
      remove(/[^\w\s]/).
      gsub(/\b(\w)\s?(\d+)\b/, '\1\2').
      compact.
      split(/\W+/).
      map { (i = synonyms.index(_1)).nil? ? _1 : (i.odd? ? _1 : synonyms[i + 1]).upcase }.
      keep_if { _1.match?(/\w{5,}|\w\d+|[[:upper:]]/) }.
      uniq
  end
  (self_words & other_words).count
end

#extract(pattern) ⇒ Object

Similar to scan, but also remove the matches from the string in place (the receiver is modified)


# File 'lib/core_ext/string.rb', line 96

def extract(pattern)
  scan(pattern).tap { remove! pattern }
end
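The `remove!` call above is an ActiveSupport extension. A rough plain-Ruby equivalent, written as a hypothetical stand-alone helper named `extract!`, shows both the return value and the side effect on the string:

```ruby
# Hypothetical stand-alone helper: return all matches and delete them
# from the given string in place.
def extract!(string, pattern)
  string.scan(pattern).tap { string.gsub!(pattern, "") }
end

text = +"call 555-1234 or 555-9876 today"   # unary plus makes sure the string is mutable
numbers = extract!(text, /\d{3}-\d{4}/)
# numbers => ["555-1234", "555-9876"]
# text    => "call  or  today"
```

Note that the whitespace around the removed matches stays in the string.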

#full_strip ⇒ Object

Similar to strip, but remove all leading and trailing characters which are neither letters nor numbers (whitespace included)


# File 'lib/core_ext/string.rb', line 91

def full_strip
  remove(/\A[^\p{L}\p{N}]*|[^\p{L}\p{N}]*\z/)
end
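To compare this with the built-in `strip`, here is a small sketch. It uses a hypothetical stand-alone helper `full_strip` with `gsub` in place of ActiveSupport's `remove`:

```ruby
# Hypothetical stand-alone helper using the same regular expression.
def full_strip(string)
  string.gsub(/\A[^\p{L}\p{N}]*|[^\p{L}\p{N}]*\z/, "")
end

input    = "  -- Hello, World! 42 ...\n"
stripped = full_strip(input)   # => "Hello, World! 42"
plain    = input.strip         # => "-- Hello, World! 42 ..."
```

Punctuation inside the string is left alone; only the leading and trailing runs are removed.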

#to_ff ⇒ Float

Same as to_f but accept both dot and comma as decimal separator

Examples:

"5.5".to_ff    # => 5.5
"5,6".to_ff    # => 5.6
"5,6".to_f     # => 5.0   (sic!)

Returns:

  • (Float)

    number parsed from text


# File 'lib/core_ext/string.rb', line 108

def to_ff
  sub(/,/, '.').to_f
end
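Since only the first comma is replaced and the rest is left to `to_f`, a few edge cases are worth keeping in mind. The following sketch uses a hypothetical stand-alone helper `to_ff` with the same body as the method above:

```ruby
# Hypothetical stand-alone version of the conversion shown above.
def to_ff(string)
  string.sub(/,/, ".").to_f   # only the first comma is replaced
end

with_unit = to_ff("12,5 km")   # => 12.5   (trailing text is ignored, as with to_f)
thousands = to_ff("1.234,5")   # => 1.234  (thousands separators are not handled)
no_number = to_ff("n/a")       # => 0.0
```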

#unglue ⇒ String

Add spaces between obviously glued words:

  • camel glued words

  • three-or-more-letter and number-only words

Examples:

"thisString has spaceProblems".unglue   # => "this String has space Problems"
"the first123meters of D25".unglue      # => "the first 123 meters of D25"

Returns:

  • (String)


# File 'lib/core_ext/string.rb', line 121

def unglue
  self.dup.tap do |string|
    [/([[:lower:]])([[:upper:]])/, /([[:alpha:]]{3,})(\d)/, /(\d)([[:alpha:]]{3,})/].freeze.each do |regexp|
      string.gsub!(regexp, '\1 \2')
    end
  end
end
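The effect of each of the three regular expressions can be seen in isolation. In this sketch the constant names are made up for readability, while the patterns themselves are taken from the method above:

```ruby
CAMEL         = /([[:lower:]])([[:upper:]])/   # lowercase followed by uppercase
LETTERS_DIGIT = /([[:alpha:]]{3,})(\d)/        # 3+ letters followed by a digit
DIGIT_LETTERS = /(\d)([[:alpha:]]{3,})/        # digit followed by 3+ letters

camel  = "spaceProblems".gsub(CAMEL, '\1 \2')             # => "space Problems"
step_1 = "first123meters".gsub(LETTERS_DIGIT, '\1 \2')    # => "first 123meters"
step_2 = step_1.gsub(DIGIT_LETTERS, '\1 \2')              # => "first 123 meters"
short  = "D25".gsub(LETTERS_DIGIT, '\1 \2')               # => "D25"
```

The three-letter minimum is what keeps short designators such as "D25" intact, and the case of the original letters is never changed.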