Class: String

Inherits:
Object
Defined in:
lib/simple_ext/object/blank.rb,
lib/simple_ext/string/access.rb,
lib/simple_ext/string/filters.rb

Constant Summary

BLANK_RE =
/\A[[:space:]]*\z/
ENCODED_BLANKS =
Concurrent::Map.new do |h, enc|
  h[enc] = Regexp.new(BLANK_RE.source.encode(enc), BLANK_RE.options | Regexp::FIXEDENCODING)
end

Instance Method Details

#at(position) ⇒ Object

Given a single integer, returns the one-character substring at that position. The first character of the string is at position 0, the next at position 1, and so on. Given a range, returns the substring covering the offsets in that range. In both cases, a negative offset is counted from the end of the string. Returns nil if the initial offset falls outside the string. Returns an empty string if the range begins exactly at the end of the string, that is, at an offset equal to the string's length.

str = "hello"
str.at(0)      # => "h"
str.at(1..3)   # => "ell"
str.at(-2)     # => "l"
str.at(-2..-1) # => "lo"
str.at(5)      # => nil
str.at(5..-1)  # => ""

If a Regexp is given, the matching portion of the string is returned. If a String is given, that string is returned if it occurs in the receiver. In both cases, nil is returned if there is no match.

str = "hello"
str.at(/lo/) # => "lo"
str.at(/ol/) # => nil
str.at("lo") # => "lo"
str.at("ol") # => nil


# File 'lib/simple_ext/string/access.rb', line 29

def at(position)
  self[position]
end
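
The boundary cases described above are easier to compare side by side. This is a minimal sketch, assuming `at` is defined exactly as in the listing above; the definition is copied into the snippet only so that it runs without simple_ext loaded.

```ruby
# Copy of the definition shown above, so the example is self-contained.
class String
  def at(position)
    self[position]
  end
end

str = "hello"   # length 5, valid indexes 0..4

str.at(4)       # => "o"   last valid index
str.at(5)       # => nil   integer index past the last character
str.at(5..-1)   # => ""    range starting exactly at the length
str.at(6..-1)   # => nil   range starting beyond the length
str.at(-6)      # => nil   negative index reaching before the start
```

Because `at` delegates to `String#[]`, these results are simply those of the core method.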

#blank? ⇒ true, false

A string is blank if it’s empty or contains only whitespace:

''.blank?       # => true
'   '.blank?    # => true
"\t\n\r".blank? # => true
' blah '.blank? # => false

Unicode whitespace is supported:

"\u00a0".blank? # => true

Returns:

  • (true, false)


# File 'lib/simple_ext/object/blank.rb', line 120

def blank?
  # The regexp that matches blank strings is expensive. For the case of empty
  # strings we can speed up this method (~3.5x) with an empty? call. The
  # penalty for the rest of strings is marginal.
  empty? ||
    begin
      BLANK_RE.match?(self)
    rescue Encoding::CompatibilityError
      ENCODED_BLANKS[self.encoding].match?(self)
    end
end
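
The rescue branch exists for strings whose encoding is not ASCII-compatible, where matching against BLANK_RE raises. The sketch below mirrors that logic in a standalone helper. `blank_string?`, `BLANK_PATTERN` and `ENCODED_BLANK_PATTERNS` are hypothetical names, and a plain Hash stands in for Concurrent::Map so the snippet has no gem dependency (an assumption: unlike the original, this cache is not thread-safe).

```ruby
BLANK_PATTERN = /\A[[:space:]]*\z/

# Plain Hash standing in for Concurrent::Map (not thread-safe).
# Builds and caches one pattern per encoding on first use.
ENCODED_BLANK_PATTERNS = Hash.new do |cache, enc|
  cache[enc] = Regexp.new(BLANK_PATTERN.source.encode(enc),
                          BLANK_PATTERN.options | Regexp::FIXEDENCODING)
end

# Hypothetical helper mirroring the body of String#blank? shown above.
def blank_string?(str)
  return true if str.empty? # cheap check before running the regexp

  begin
    BLANK_PATTERN.match?(str)
  rescue Encoding::CompatibilityError
    # Reached for encodings such as UTF-16LE.
    ENCODED_BLANK_PATTERNS[str.encoding].match?(str)
  end
end

utf16_blank = " \t".encode("UTF-16LE")
utf16_text  = "my value".encode("UTF-16LE")

blank_string?("")           # => true
blank_string?("\u00a0")     # => true
blank_string?(utf16_blank)  # => true
blank_string?(utf16_text)   # => false
```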

#exclude?(string) ⇒ Boolean

The inverse of String#include?. Returns true if the string does not include the other string.

"hello".exclude? "lo" # => false
"hello".exclude? "ol" # => true
"hello".exclude? ?h   # => false

Returns:

  • (Boolean)


# File 'lib/simple_ext/string/access.rb', line 102

def exclude?(string)
  !include?(string)
end

#first(limit = 1) ⇒ Object

Returns the first character. If a limit is supplied, returns a substring from the beginning of the string until it reaches the limit value. If the given limit is greater than or equal to the string length, returns a copy of self.

str = "hello"
str.first    # => "h"
str.first(1) # => "h"
str.first(2) # => "he"
str.first(0) # => ""
str.first(6) # => "hello"


# File 'lib/simple_ext/string/access.rb', line 78

def first(limit = 1)
  self[0, limit] || raise(ArgumentError, "negative limit")
end
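
The `|| raise` in the listing covers the one case where `String#[]` returns nil here: a negative length. A minimal sketch, assuming the definition above (copied in so the snippet runs on its own):

```ruby
# Copy of the definition shown above, so the example is self-contained.
class String
  def first(limit = 1)
    self[0, limit] || raise(ArgumentError, "negative limit")
  end
end

str = "hello"

str[0, -1]                 # => nil, core String#[] rejects a negative length
copy = str.first(6)        # => "hello"
copy.equal?(str)           # => false, a new string is returned

message =
  begin
    str.first(-1)
  rescue ArgumentError => e
    e.message              # => "negative limit"
  end
```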

#from(position) ⇒ Object

Returns a substring from the given position to the end of the string. If the position is negative, it is counted from the end of the string.

str = "hello"
str.from(0)  # => "hello"
str.from(3)  # => "lo"
str.from(-2) # => "lo"

You can combine it with the to method to take a slice from the middle of the string:

str = "hello"
str.from(0).to(-1) # => "hello"
str.from(1).to(-2) # => "ell"


# File 'lib/simple_ext/string/access.rb', line 46

def from(position)
  self[position, length]
end

#last(limit = 1) ⇒ Object

Returns the last character of the string. If a limit is supplied, returns a substring from the end of the string until it reaches the limit value (counting backwards). If the given limit is greater than or equal to the string length, returns a copy of self.

str = "hello"
str.last    # => "o"
str.last(1) # => "o"
str.last(2) # => "lo"
str.last(0) # => ""
str.last(6) # => "hello"


# File 'lib/simple_ext/string/access.rb', line 92

def last(limit = 1)
  self[[length - limit, 0].max, limit] || raise(ArgumentError, "negative limit")
end
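
The start offset in the listing is clamped to 0 so that an oversized limit returns the whole string instead of nil. A short sketch of that arithmetic, assuming the definition above (copied in so the snippet is self-contained):

```ruby
# Copy of the definition shown above, so the example is self-contained.
class String
  def last(limit = 1)
    self[[length - limit, 0].max, limit] || raise(ArgumentError, "negative limit")
  end
end

str = "hello"

[str.length - 2, 0].max   # => 3, so last(2) reads str[3, 2]
str.last(2)               # => "lo"

[str.length - 6, 0].max   # => 0, clamped, so last(6) reads str[0, 6]
str.last(6)               # => "hello"

raised =
  begin
    str.last(-1)          # reads str[6, -1], which is nil
    false
  rescue ArgumentError
    true
  end
```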

#remove(*patterns) ⇒ Object

Returns a new string with all occurrences of the patterns removed.

str = "foo bar test"
str.remove(" test")                 # => "foo bar"
str.remove(" test", /bar/)          # => "foo "
str                                 # => "foo bar test"


# File 'lib/simple_ext/string/filters.rb', line 32

def remove(*patterns)
  dup.remove!(*patterns)
end

#remove!(*patterns) ⇒ Object

Alters the string by removing all occurrences of the patterns.

str = "foo bar test"
str.remove!(" test", /bar/)         # => "foo "
str                                 # => "foo "


# File 'lib/simple_ext/string/filters.rb', line 40

def remove!(*patterns)
  patterns.each do |pattern|
    gsub! pattern, ""
  end

  self
end
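
One consequence of the explicit `self` at the end of the listing: unlike `gsub!`, which returns nil when nothing was replaced, `remove!` always returns the receiver, so calls can be chained safely. A minimal sketch, assuming the definition above (copied in so the snippet runs without simple_ext):

```ruby
# Copy of the definition shown above, so the example is self-contained.
class String
  def remove!(*patterns)
    patterns.each do |pattern|
      gsub! pattern, ""
    end

    self
  end
end

(+"foo").gsub!("x", "")         # => nil, nothing matched

str = +"foo bar test"
result = str.remove!("xyz")     # => "foo bar test", nothing matched
result.equal?(str)              # => true, the receiver itself is returned

# Patterns are applied in order, each to the result of the previous one.
str.remove!("o", /\s+/)         # => "fbartest"
```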

#squish ⇒ Object

Returns the string, first removing all whitespace on both ends of the string, and then changing remaining consecutive whitespace groups into one space each.

Note that it handles both ASCII and Unicode whitespace.

%{ Multi-line
   string }.squish                   # => "Multi-line string"
" foo   bar    \n   \t   boo".squish # => "foo bar boo"


# File 'lib/simple_ext/string/filters.rb', line 13

def squish
  dup.squish!
end

#squish! ⇒ Object

Performs a destructive squish. See String#squish.

str = " foo   bar    \n   \t   boo"
str.squish!                         # => "foo bar boo"
str                                 # => "foo bar boo"


# File 'lib/simple_ext/string/filters.rb', line 21

def squish!
  gsub!(/[[:space:]]+/, " ")
  strip!
  self
end
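
The order of the two calls in the listing matters for Unicode input. Core `strip!` removes only ASCII whitespace and null characters, so the `gsub!` runs first and turns every whitespace run, Unicode included, into a single ASCII space that `strip!` can then trim. A minimal sketch, assuming the definition above (copied in so the snippet is self-contained):

```ruby
# Copy of the definition shown above, so the example is self-contained.
class String
  def squish!
    gsub!(/[[:space:]]+/, " ")
    strip!
    self
  end
end

# U+00A0 no-break space, U+3000 ideographic space, U+2003 em space
str = +"\u00a0 foo \u3000\n bar\u2003"

"\u00a0foo".strip == "\u00a0foo"   # => true, strip alone leaves U+00A0 in place
str.squish!                        # => "foo bar"
```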

#to(position) ⇒ Object

Returns a substring from the beginning of the string to the given position. If the position is negative, it is counted from the end of the string.

str = "hello"
str.to(0)  # => "h"
str.to(3)  # => "hell"
str.to(-2) # => "hell"

You can combine it with the from method to take a slice from the middle of the string:

str = "hello"
str.from(0).to(-1) # => "hello"
str.from(1).to(-2) # => "ell"


# File 'lib/simple_ext/string/access.rb', line 63

def to(position)
  position += size if position < 0
  self[0, position + 1] || +""
end
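
The out-of-range behaviour of `to` follows from the two lines in the listing: negative positions are shifted by the string size, and a nil slice falls back to an empty string. The sketch below traces a few cases, assuming the `from` and `to` definitions shown on this page (copied in so the snippet runs on its own).

```ruby
# Copies of the definitions shown on this page, so the example is self-contained.
class String
  def from(position)
    self[position, length]
  end

  def to(position)
    position += size if position < 0
    self[0, position + 1] || +""
  end
end

str = "hello"

str.to(10)          # => "hello"  reads str[0, 11]
str.to(-5)          # => "h"      -5 + 5 = 0, reads str[0, 1]
str.to(-6)          # => ""       -6 + 5 = -1, reads str[0, 0]
str.to(-7)          # => ""       str[0, -1] is nil, so the fallback is used

str.from(5)         # => ""       offset equal to the length
str.from(6)         # => nil      offset beyond the length

str.from(1).to(-2)  # => "ell"    "ello" without its last character
```

Note the asymmetry: `to` never returns nil, while `from` does when the position lies beyond the end of the string.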

#truncate(truncate_at, options = {}) ⇒ Object

Truncates a given text to a given length (truncate_at) if the text is longer than that length:

'Once upon a time in a world far far away'.truncate(27)
# => "Once upon a time in a wo..."

Pass a string or regexp :separator to truncate text at a natural break:

'Once upon a time in a world far far away'.truncate(27, separator: ' ')
# => "Once upon a time in a..."

'Once upon a time in a world far far away'.truncate(27, separator: /\s/)
# => "Once upon a time in a..."

The last characters are replaced with the :omission string (defaults to "...") so that the total length does not exceed truncate_at:

'And they found that many people were sleeping better.'.truncate(25, omission: '... (continued)')
# => "And they f... (continued)"


# File 'lib/simple_ext/string/filters.rb', line 66

def truncate(truncate_at, options = {})
  return dup unless length > truncate_at

  omission = options[:omission] || "..."
  length_with_room_for_omission = truncate_at - omission.length
  stop = \
    if options[:separator]
      rindex(options[:separator], length_with_room_for_omission) || length_with_room_for_omission
    else
      length_with_room_for_omission
    end

  +"#{self[0, stop]}#{omission}"
end
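
The arithmetic behind the examples can be traced by hand: the omission length is subtracted from truncate_at first, and the separator search runs backwards from that point. A minimal sketch, assuming the definition above (copied in, lightly reformatted, so the snippet runs without simple_ext):

```ruby
# Copy of the definition shown above, so the example is self-contained.
class String
  def truncate(truncate_at, options = {})
    return dup unless length > truncate_at

    omission = options[:omission] || "..."
    length_with_room_for_omission = truncate_at - omission.length
    stop =
      if options[:separator]
        rindex(options[:separator], length_with_room_for_omission) || length_with_room_for_omission
      else
        length_with_room_for_omission
      end

    +"#{self[0, stop]}#{omission}"
  end
end

text = "Once upon a time in a world far far away"

# 27 - "...".length leaves room for 24 characters of text.
text.truncate(27)                  # => "Once upon a time in a wo..."
text.truncate(27).length           # => 27

# With a separator, the cut moves back to the last space at or before offset 24.
text.rindex(" ", 24)               # => 21
text.truncate(27, separator: " ")  # => "Once upon a time in a..."

# When the omission is longer than truncate_at, the slice is nil and
# only the omission remains, so the result exceeds truncate_at.
text.truncate(2)                   # => "..."
```

The last call is an edge case worth knowing about if callers rely on the length bound.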

#truncate_bytes(truncate_at, omission: "…") ⇒ Object

Truncates text to at most truncate_at bytes in length without breaking string encoding by splitting multibyte characters or breaking grapheme clusters (“perceptual characters”) by truncating at combining characters.

>> "🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪".size
=> 20
>> "🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪".bytesize
=> 80
>> "🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪🔪".truncate_bytes(20)
=> "🔪🔪🔪🔪…"

The truncated text ends with the :omission string, defaulting to “…”, for a total length not exceeding truncate_at bytes.



# File 'lib/simple_ext/string/filters.rb', line 95

def truncate_bytes(truncate_at, omission: "…")
  omission ||= ""

  case
  when bytesize <= truncate_at
    dup
  when omission.bytesize > truncate_at
    raise ArgumentError, "Omission #{omission.inspect} is #{omission.bytesize}, larger than the truncation length of #{truncate_at} bytes"
  when omission.bytesize == truncate_at
    omission.dup
  else
    self.class.new.tap do |cut|
      cut_at = truncate_at - omission.bytesize

      scan(/\X/) do |grapheme|
        if cut.bytesize + grapheme.bytesize <= cut_at
          cut << grapheme
        else
          break
        end
      end

      cut << omission
    end
  end
end
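
The byte budget in the example works out as follows: "…" takes 3 bytes in UTF-8, leaving 17 of the 20 bytes for text, and each 🔪 takes 4 bytes, so four fit (16 bytes) and the result is 19 bytes long. The sketch below checks this and shows the grapheme handling. It assumes the definition above with the "…" default from the method signature, copied in and lightly condensed so the snippet runs on its own in a UTF-8 source file.

```ruby
# Condensed copy of the definition shown above, so the example is self-contained.
class String
  def truncate_bytes(truncate_at, omission: "…")
    omission ||= ""

    if bytesize <= truncate_at
      dup
    elsif omission.bytesize > truncate_at
      raise ArgumentError, "Omission #{omission.inspect} is #{omission.bytesize}, " \
                           "larger than the truncation length of #{truncate_at} bytes"
    elsif omission.bytesize == truncate_at
      omission.dup
    else
      self.class.new.tap do |cut|
        cut_at = truncate_at - omission.bytesize

        # \X matches one extended grapheme cluster at a time.
        scan(/\X/) do |grapheme|
          break unless cut.bytesize + grapheme.bytesize <= cut_at

          cut << grapheme
        end

        cut << omission
      end
    end
  end
end

knives = "🔪" * 20
knives.truncate_bytes(20)                 # => "🔪🔪🔪🔪…"
knives.truncate_bytes(20).bytesize        # => 19
knives.truncate_bytes(20, omission: nil)  # => "🔪🔪🔪🔪🔪"

# "e" followed by U+0301 (combining acute) is one 3-byte grapheme cluster.
accented = "cafe\u0301"
accented.byteslice(0, 5).valid_encoding?      # => false, a raw byte cut splits U+0301
accented.truncate_bytes(5, omission: "")      # => "caf", the whole cluster is dropped
```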

#truncate_words(words_count, options = {}) ⇒ Object

Truncates a given text after a given number of words (words_count):

'Once upon a time in a world far far away'.truncate_words(4)
# => "Once upon a time..."

Pass a string or regexp :separator to specify a different separator of words:

'Once<br>upon<br>a<br>time<br>in<br>a<br>world'.truncate_words(5, separator: '<br>')
# => "Once<br>upon<br>a<br>time<br>in..."

The truncated text ends with the :omission string (defaults to "..."):

'And they found that many people were sleeping better.'.truncate_words(5, omission: '... (continued)')
# => "And they found that many... (continued)"


# File 'lib/simple_ext/string/filters.rb', line 136

def truncate_words(words_count, options = {})
  sep = options[:separator] || /\s+/
  sep = Regexp.escape(sep.to_s) unless Regexp === sep
  if self =~ /\A((?>.+?#{sep}){#{words_count - 1}}.+?)#{sep}.*/m
    $1 + (options[:omission] || "...")
  else
    dup
  end
end
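
The regexp in the listing requires a separator after the last kept word, so a string with exactly words_count words does not match and is returned unchanged. The captured group also keeps the original separators between the kept words. A minimal sketch, assuming the definition above (copied in so the snippet runs without simple_ext):

```ruby
# Copy of the definition shown above, so the example is self-contained.
class String
  def truncate_words(words_count, options = {})
    sep = options[:separator] || /\s+/
    sep = Regexp.escape(sep.to_s) unless Regexp === sep
    if self =~ /\A((?>.+?#{sep}){#{words_count - 1}}.+?)#{sep}.*/m
      $1 + (options[:omission] || "...")
    else
      dup
    end
  end
end

"Once upon a time in a world".truncate_words(4)  # => "Once upon a time..."

# Exactly four words: nothing follows the fourth, so no omission is added.
"Once upon a time".truncate_words(4)             # => "Once upon a time"

# Runs of whitespace between kept words are preserved as written.
"a   b   c".truncate_words(2)                    # => "a   b..."

# A String separator is escaped before being interpolated into the regexp.
"Once<br>upon<br>a<br>time".truncate_words(2, separator: "<br>")
# => "Once<br>upon..."
```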