Class: String

Inherits:
Object
Defined in:
lib/bioinform/support/strip_doc.rb,
lib/bioinform/support/multiline_squish.rb,
lib/bioinform/support/third_part/active_support/core_ext/string/access.rb,
lib/bioinform/support/third_part/active_support/core_ext/string/filters.rb,
lib/bioinform/support/third_part/active_support/core_ext/string/behavior.rb,
lib/bioinform/support/third_part/active_support/core_ext/string/multibyte.rb

Instance Method Details

#acts_like_string? ⇒ Boolean

Enable more predictable duck-typing on String-like classes. See Object#acts_like?.

Returns:

  • (Boolean)


# File 'lib/bioinform/support/third_part/active_support/core_ext/string/behavior.rb', line 3

def acts_like_string?
  true
end
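
The duck-typing idea can be sketched as follows. This is an illustration only: the Label class and the string_like? helper are hypothetical, and string_like? merely stands in for the kind of check Object#acts_like?(:string) is expected to perform.

```ruby
# Hypothetical String-like wrapper used only for this illustration.
class Label
  def initialize(text)
    @text = text
  end

  def to_s
    @text
  end

  # Opt in to the String duck type, mirroring the method listed above.
  def acts_like_string?
    true
  end
end

# Hypothetical stand-in for an acts_like?(:string) style check.
def string_like?(object)
  object.respond_to?(:acts_like_string?) && object.acts_like_string?
end

string_like?(Label.new("gene"))  # => true
string_like?(42)                 # => false
```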

#at(position) ⇒ Object

Returns the character at the position treating the string as an array (where 0 is the first character).

Examples:

"hello".at(0)  # => "h"
"hello".at(4)  # => "o"
"hello".at(10) # => "" (the nil from an out-of-range slice is converted by to_s)


# File 'lib/bioinform/support/third_part/active_support/core_ext/string/access.rb', line 11

def at(position)
  mb_chars[position, 1].to_s
end

#first(limit = 1) ⇒ Object

Returns the first character of the string or the first limit characters.

Examples:

"hello".first     # => "h"
"hello".first(2)  # => "he"
"hello".first(10) # => "hello"


# File 'lib/bioinform/support/third_part/active_support/core_ext/string/access.rb', line 41

def first(limit = 1)
  if limit == 0
    ''
  elsif limit >= size
    self
  else
    mb_chars[0...limit].to_s
  end
end

#from(position) ⇒ Object

Returns the remainder of the string from the position, treating the string as an array (where 0 is the first character).

Examples:

"hello".from(0)  # => "hello"
"hello".from(2)  # => "llo"
"hello".from(10) # => "" (the nil from an out-of-range slice is converted by to_s)


# File 'lib/bioinform/support/third_part/active_support/core_ext/string/access.rb', line 21

def from(position)
  mb_chars[position..-1].to_s
end

#is_utf8? ⇒ Boolean

Returns true if the string has UTF-8 semantics (a String used for purely byte resources is unlikely to have them), returns false otherwise.

Returns:

  • (Boolean)


# File 'lib/bioinform/support/third_part/active_support/core_ext/string/multibyte.rb', line 68

def is_utf8?
  case encoding
  when Encoding::UTF_8
    valid_encoding?
  when Encoding::ASCII_8BIT, Encoding::US_ASCII
    dup.force_encoding(Encoding::UTF_8).valid_encoding?
  else
    false
  end
end
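
A minimal sketch of how the three branches behave. To run without the library, the method body is copied into a standalone helper; the name utf8_semantics? is hypothetical and not part of the library.

```ruby
# Standalone copy of the is_utf8? logic; the helper name is hypothetical.
def utf8_semantics?(string)
  case string.encoding
  when Encoding::UTF_8
    string.valid_encoding?
  when Encoding::ASCII_8BIT, Encoding::US_ASCII
    # Reinterpret a copy of the bytes as UTF-8 and check that they are well formed.
    string.dup.force_encoding(Encoding::UTF_8).valid_encoding?
  else
    false
  end
end

utf8_semantics?("M\u00FCller")                              # => true
utf8_semantics?("abc".b)                                    # => true  (binary, but valid as UTF-8)
utf8_semantics?("\xFF".b)                                   # => false (binary, not valid as UTF-8)
utf8_semantics?("M\u00FCller".encode(Encoding::ISO_8859_1)) # => false (other encodings are rejected)
```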

#last(limit = 1) ⇒ Object

Returns the last character of the string or the last limit characters.

Examples:

"hello".last     # => "o"
"hello".last(2)  # => "lo"
"hello".last(10) # => "hello"


# File 'lib/bioinform/support/third_part/active_support/core_ext/string/access.rb', line 57

def last(limit = 1)
  if limit == 0
    ''
  elsif limit >= size
    self
  else
    mb_chars[(-limit)..-1].to_s
  end
end

#mb_chars ⇒ Object

Multibyte proxy

mb_chars is a multibyte safe proxy for string methods.

In Ruby 1.8 and older it creates and returns an instance of the ActiveSupport::Multibyte::Chars class, which encapsulates the original string. Unicode-safe versions of all the String methods are defined on this proxy class. If the proxy class doesn't respond to a certain method, the call is forwarded to the encapsulated string.

name = 'Claus Müller'
name.reverse # => "rell??M sualC"
name.length  # => 13

name.mb_chars.reverse.to_s # => "rellüM sualC"
name.mb_chars.length       # => 12

In Ruby 1.9 and newer mb_chars returns self because String is (mostly) encoding aware. This means that it becomes easy to run one version of your code on multiple Ruby versions.

Method chaining

All the methods on the Chars proxy which normally return a string will return a Chars object. This allows method chaining on the result of any of these methods.

name.mb_chars.reverse.length # => 12

Interoperability and configuration

The Chars object tries to be as interchangeable with String objects as possible: sorting and comparing between String and Chars work as expected. The bang! methods change the internal string representation in the Chars object. Interoperability problems can be resolved easily with a to_s call.

For more information about the methods defined on the Chars proxy see ActiveSupport::Multibyte::Chars. For information about how to change the default Multibyte behavior see ActiveSupport::Multibyte.



# File 'lib/bioinform/support/third_part/active_support/core_ext/string/multibyte.rb', line 39

def mb_chars
  if ActiveSupport::Multibyte.proxy_class.consumes?(self)
    ActiveSupport::Multibyte.proxy_class.new(self)
  else
    self
  end
end

#multiline_squish ⇒ Object



# File 'lib/bioinform/support/multiline_squish.rb', line 3

def multiline_squish
  split("\n").map(&:squish).join("\n").gsub(/\A\n+/,'').gsub(/\n+\z/,'')
end
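
The method has no description, so here is a sketch of what the listing appears to do: each line is squished separately, and blank lines at the start and end are dropped while interior blank lines are kept. To stay self-contained, the block redefines both methods; the squish shown here is a simplified, non-destructive stand-in for the one documented below.

```ruby
class String
  # Simplified stand-in for String#squish so the sketch runs without the library.
  def squish
    strip.gsub(/\s+/, ' ')
  end

  # Same body as the listing above.
  def multiline_squish
    split("\n").map(&:squish).join("\n").gsub(/\A\n+/, '').gsub(/\n+\z/, '')
  end
end

text = "\n  A   C \n\n  G\tT  \n"
text.multiline_squish  # => "A C\n\nG T"
```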

#squish ⇒ Object

Returns the string, first removing all whitespace on both ends of the string, and then changing remaining consecutive whitespace groups into one space each.

Examples:

%{ Multi-line
   string }.squish                   # => "Multi-line string"
" foo   bar    \n   \t   boo".squish # => "foo bar boo"


# File 'lib/bioinform/support/third_part/active_support/core_ext/string/filters.rb', line 12

def squish
  dup.squish!
end

#squish! ⇒ Object

Performs a destructive squish. See String#squish.



# File 'lib/bioinform/support/third_part/active_support/core_ext/string/filters.rb', line 17

def squish!
  strip!
  gsub!(/\s+/, ' ')
  self
end
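
A short sketch contrasting the destructive and non-destructive forms. Both method bodies are copied from the listings on this page so the block runs without the library.

```ruby
class String
  def squish!
    strip!
    gsub!(/\s+/, ' ')
    self
  end

  def squish
    dup.squish!
  end
end

original = String.new(" foo   bar \n")
copy = original.squish
copy      # => "foo bar"
original  # => " foo   bar \n" (unchanged)

original.squish!
original  # => "foo bar" (modified in place)

# Because the method ends with self, it returns the receiver even when
# nothing changed, whereas strip! and gsub! return nil in that case.
String.new("foo").squish!  # => "foo"
```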

#strip_doc ⇒ Object



# File 'lib/bioinform/support/strip_doc.rb', line 6

def strip_doc
  gsub(/^#{self[/\A +/]}/,'')
end
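
The method has no description. Judging from the listing, it takes the run of spaces at the very start of the string and removes that same prefix from the beginning of every line. A minimal sketch, with the method body copied so the block runs without the library; the sample text is made up for illustration:

```ruby
class String
  # Same body as the listing above.
  def strip_doc
    gsub(/^#{self[/\A +/]}/, '')
  end
end

text = "    motif_1\n      A C G T\n    1 0 0 0\n"
text.strip_doc         # => "motif_1\n  A C G T\n1 0 0 0\n"

# With no leading spaces the prefix is empty and the string is unchanged.
"no indent".strip_doc  # => "no indent"
```

Relative indentation is preserved: the second line keeps the two spaces it had beyond the first line's indent.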

#to(position) ⇒ Object

Returns the beginning of the string up to the position treating the string as an array (where 0 is the first character).

Examples:

"hello".to(0)  # => "h"
"hello".to(2)  # => "hel"
"hello".to(10) # => "hello"


# File 'lib/bioinform/support/third_part/active_support/core_ext/string/access.rb', line 31

def to(position)
  mb_chars[0..position].to_s
end

#truncate(length, options = {}) ⇒ Object

Truncates a given text after a given length if text is longer than length:

"Once upon a time in a world far far away".truncate(27)
# => "Once upon a time in a wo..."

Pass a :separator to truncate text at a natural break:

"Once upon a time in a world far far away".truncate(27, :separator => ' ')
# => "Once upon a time in a..."

The last characters will be replaced with the :omission string (defaults to "...") for a total length not exceeding length:

"And they found that many people were sleeping better.".truncate(25, :omission => "... (continued)")
# => "And they f... (continued)"


# File 'lib/bioinform/support/third_part/active_support/core_ext/string/filters.rb', line 38

def truncate(length, options = {})
  text = self.dup
  options[:omission] ||= "..."

  length_with_room_for_omission = length - options[:omission].mb_chars.length
  chars = text.mb_chars
  stop = options[:separator] ?
    (chars.rindex(options[:separator].mb_chars, length_with_room_for_omission) || length_with_room_for_omission) : length_with_room_for_omission

  (chars.length > length ? chars[0...stop] + options[:omission] : text).to_s
end
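
One detail of the listing worth knowing: options[:omission] ||= "..." writes the default into the hash the caller passed in. The sketch below copies the method with the mb_chars calls dropped, which assumes a Ruby version where mb_chars returns the string itself.

```ruby
class String
  # Copy of the truncate listing without mb_chars (see the assumption above).
  def truncate(length, options = {})
    text = dup
    options[:omission] ||= "..."

    length_with_room_for_omission = length - options[:omission].length
    stop = if options[:separator]
      text.rindex(options[:separator], length_with_room_for_omission) || length_with_room_for_omission
    else
      length_with_room_for_omission
    end

    (text.length > length ? text[0...stop] + options[:omission] : text).to_s
  end
end

options = { :separator => ' ' }
"Once upon a time in a world far far away".truncate(27, options)
# => "Once upon a time in a..."

# The caller's hash now carries the default omission string.
options[:omission]  # => "..."
```

If the hash is shared or frozen, passing options.dup avoids the side effect.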