Class: String

Inherits:
Object
Defined in:
lib/extensions/remove_accents.rb,
lib/extensions/string.rb

Overview

RemoveAccents version 1.0.3 © 2008-2009 Solutions Informatiques Techniconseils inc.

This module adds 2 methods to the string class. Up-to-date version and documentation available at:

www.techniconseils.ca/en/scripts-remove-accents-ruby.php

This script is available under the following license: Creative Commons Attribution-Share Alike 2.5.

See full license and details at: creativecommons.org/licenses/by-sa/2.5/ca/

Version history:

* 1.0.3 : July 23 2009
            Corrected some incorrect character codes. Source is now Wikipedia at:
              http://en.wikipedia.org/wiki/ISO/IEC_8859-1#Related_character_maps
            Thanks to Raimon Fernandez for pointing out the incorrect codes.
* 1.0.2 : October 29 2008
            Slightly optimized version of urlize - Jonathan Grenier ([email protected])
* 1.0.1 : October 29 2008
            First public revision - Jonathan Grenier ([email protected])

Constant Summary

ACCENTS_MAPPING =

The extended characters map used by removeaccents. The accented characters are written as numeric code points rather than literals to sidestep source-encoding issues. removeaccents packs them with pack('U*'), so they are read as Unicode code points; for values up to 255 these coincide with ISO-8859-1.

{
  'E' => [200,201,202,203],
  'e' => [232,233,234,235],
  'A' => [192,193,194,195,196,197],
  'a' => [224,225,226,227,228,229],
  'C' => [199],
  'c' => [231],
  'O' => [210,211,212,213,214,216],
  'o' => [242,243,244,245,246,248],
  'I' => [204,205,206,207],
  'i' => [236,237,238,239],
  'U' => [217,218,219,220],
  'u' => [249,250,251,252],
  'N' => [209],
  'n' => [241],
  'Y' => [221],
  'y' => [253,255],
  'AE' => [198],  # Æ
  'ae' => [230],  # æ
  'OE' => [338],  # Œ (outside ISO-8859-1)
  'oe' => [339]   # œ (outside ISO-8859-1)
}
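
To make the numeric encoding concrete, here is a minimal sketch of how code points expand into the accented characters they stand for. The local `mapping` hash is a hypothetical two-entry excerpt, not the full constant, and `transform_values` assumes Ruby 2.4 or later.

```ruby
# Hypothetical excerpt; the full table lives in String::ACCENTS_MAPPING.
mapping = { 'e' => [232, 233, 234, 235], 'c' => [231] }

# Array#pack('U*') turns each integer into the UTF-8 character with that code point.
expanded = mapping.transform_values { |codes| codes.pack('U*') }

# Print each group of accented characters next to its plain replacement.
expanded.each { |letter, accented| puts "#{accented} -> #{letter}" }
```

Each packed string is what ends up inside the character class that removeaccents builds for its replacement letter.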

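Class Method Summary

  • .random_alphanumeric_string(length, options = {}) ⇒ Object

Instance Method Summary

  • #html_escape ⇒ Object

  • #removeaccents ⇒ Object
    Remove the accents from the string.

  • #strip_html ⇒ Object

  • #strip_javascript ⇒ Object

  • #to_bool ⇒ Object

  • #to_upper_camelcase ⇒ Object
    Converts a snake case string to upper camel case.

  • #truncate(options = {}) ⇒ Object

  • #urlize(options = {}) ⇒ Object
    Convert a string to a format suitable for a URL without ever using escaped characters.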

Class Method Details

.random_alphanumeric_string(length, options = {}) ⇒ Object



# File 'lib/extensions/string.rb', line 2

def self.random_alphanumeric_string(length, options = {})
  # Each character class is included unless its option is explicitly false.
  valid_chars = []
  valid_chars += ('A'..'Z').to_a unless options[:include_upper_case] == false
  valid_chars += ('a'..'z').to_a unless options[:include_lower_case] == false
  valid_chars += ('0'..'9').to_a unless options[:include_numbers] == false
  raise ArgumentError, "all character classes are disabled" if valid_chars.empty?
  # rand is not cryptographically secure; do not use the result as a secret.
  Array.new(length) { valid_chars[rand(valid_chars.size)] }.join
end
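
As a usage illustration, the sketch below reproduces the option handling in a standalone helper and restricts the output to digits. `random_alnum` is a hypothetical name introduced only for this example. The last line shows the standard library's `SecureRandom.alphanumeric` (Ruby 2.5+), which may be a better fit when the string serves as a token; that is a suggestion, not something the original code uses.

```ruby
require 'securerandom'

# Hypothetical standalone helper mirroring the documented option names.
def random_alnum(length, options = {})
  pool = []
  pool += ('A'..'Z').to_a unless options[:include_upper_case] == false
  pool += ('a'..'z').to_a unless options[:include_lower_case] == false
  pool += ('0'..'9').to_a unless options[:include_numbers] == false
  Array.new(length) { pool.sample }.join
end

# Disable both letter classes to get a digits-only string.
digits = random_alnum(8, :include_upper_case => false, :include_lower_case => false)

# Standard-library alternative backed by a secure random source.
token = SecureRandom.alphanumeric(16)
```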

Instance Method Details

#html_escape ⇒ Object



# File 'lib/extensions/string.rb', line 31

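def html_escape
  # "&" is replaced first so the entities produced afterwards are not escaped twice.
  # Single quotes are left untouched.
  gsub(/&/, "&amp;").gsub(/"/, "&quot;").gsub(/>/, "&gt;").gsub(/</, "&lt;")
end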
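
For comparison, the following sketch applies the same four substitutions to a local string and sets the result next to the standard library's `CGI.escapeHTML`. The sample input is made up for the example. The one difference to expect is that `CGI.escapeHTML` also escapes the single quote.

```ruby
require 'cgi'

input = %q(<a href="x?a=1&b=2">it's</a>)

# The same substitutions as html_escape, applied to a local string.
manual = input.gsub(/&/, '&amp;').gsub(/"/, '&quot;').gsub(/>/, '&gt;').gsub(/</, '&lt;')

# Standard-library equivalent; it additionally turns ' into &#39;.
stdlib = CGI.escapeHTML(input)

puts manual
puts stdlib
```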

#removeaccents ⇒ Object

Remove the accents from the string. Uses String::ACCENTS_MAPPING as the source map.



# File 'lib/extensions/remove_accents.rb', line 54

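def removeaccents
  str = String.new(self)
  String::ACCENTS_MAPPING.each do |letter, accents|
    # pack('U*') builds a UTF-8 string of the accented characters, which
    # becomes a character class such as /[èéêë]/. The Ruby 1.8 'U' kcode
    # argument to Regexp.new is not needed on Ruby 1.9 and later.
    rxp = Regexp.new("[#{accents.pack('U*')}]")
    str.gsub!(rxp, letter)
  end
  str
end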
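
A different technique, not the one this library uses, is to decompose characters with Unicode normalization and drop the combining marks. The sketch below assumes Ruby 2.2 or later for `String#unicode_normalize`; `strip_diacritics` is a hypothetical name. Ligatures such as æ have no decomposition, so a mapping table is still needed for them.

```ruby
# Alternative sketch: NFD splits "é" into "e" plus a combining accent,
# and \p{Mn} matches those combining (nonspacing) marks.
def strip_diacritics(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

plain = strip_diacritics('Crème brûlée')
ligature = strip_diacritics('æ')   # unchanged: æ does not decompose
puts plain
```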

#strip_html ⇒ Object



# File 'lib/extensions/string.rb', line 16

def strip_html
  gsub(%r{</?[^>]+?>}, '')
end

#strip_javascript ⇒ Object



# File 'lib/extensions/string.rb', line 12

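def strip_javascript
  # The lazy [\s\S]*? stops at the first closing tag, so markup between two
  # separate script blocks is preserved.
  gsub(/<script.*?>[\s\S]*?<\/script>/i, "")
end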
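
The effect of greedy versus lazy matching in a pattern of this shape is easy to see on input with two script blocks. This is a small sketch with a made-up HTML string.

```ruby
html = '<script>a()</script><p>keep me</p><script>b()</script>'

# Greedy body: runs from the first <script> to the LAST </script>.
greedy = html.gsub(/<script.*?>[\s\S]*<\/script>/i, '')

# Lazy body: each match ends at the nearest </script>.
lazy = html.gsub(/<script.*?>[\s\S]*?<\/script>/i, '')

puts greedy.inspect
puts lazy.inspect
```

Neither variant is a substitute for a real HTML sanitizer, since regular expressions cannot cover event-handler attributes or malformed tags.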

#to_bool ⇒ Object



# File 'lib/extensions/string.rb', line 20

def to_bool
  # The comparison is case-insensitive. After downcase, only lowercase
  # strings can match, so the lists hold nothing else.
  downcased_str = downcase
  if %w[true 1 t].include?(downcased_str)
    true
  elsif %w[false 0 f].include?(downcased_str)
    false
  else
    raise "string with value #{self} is not convertible to boolean"
  end
end
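
A short usage sketch of the same conversion rules follows, written as a standalone function so it runs without the extension loaded. `parse_bool` and the two constants are hypothetical names for this example.

```ruby
TRUE_VALUES  = %w[true 1 t].freeze
FALSE_VALUES = %w[false 0 f].freeze

# Hypothetical standalone equivalent of String#to_bool.
def parse_bool(text)
  value = text.downcase
  return true if TRUE_VALUES.include?(value)
  return false if FALSE_VALUES.include?(value)
  raise "string with value #{text} is not convertible to boolean"
end

puts parse_bool('TRUE')   # matching ignores case
puts parse_bool('0')

# Anything outside the two lists raises, so callers should be ready to rescue.
begin
  parse_bool('yes')
rescue RuntimeError => e
  puts e.message
end
```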

#to_upper_camelcase ⇒ Object

Converts a snake case string to upper camel case.



# File 'lib/extensions/string.rb', line 43

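def to_upper_camelcase
  # Upcase the first character of each underscore-separated word and keep the
  # rest unchanged. Empty segments (e.g. from "a__b") are passed through
  # instead of raising on nil.
  split('_').map { |word| word.empty? ? word : word[0, 1].upcase + word[1..-1] }.join
end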
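
The conversion can be sketched as a standalone function as follows. `upper_camelcase` is a hypothetical name, and the second call shows the edge case of leading or doubled underscores, which produce empty segments when split.

```ruby
# Hypothetical standalone helper; the documented method is String#to_upper_camelcase.
def upper_camelcase(snake)
  snake.split('_')
       .reject(&:empty?)                       # drop segments produced by "__" or a leading "_"
       .map { |word| word[0].upcase + word[1..-1] }
       .join
end

simple = upper_camelcase('user_account_id')
edge   = upper_camelcase('_private__field')
puts simple
puts edge
```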

#truncate(options = {}) ⇒ Object



# File 'lib/extensions/string.rb', line 35

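def truncate(options = {})
  # Plain Hash#merge applies the defaults without requiring ActiveSupport's
  # reverse_merge! and without modifying the caller's hash.
  options = { :length => 30, :omission => "..." }.merge(options)
  # Reserve room for the omission so the result is at most :length characters.
  len = options[:length] - options[:omission].length
  chars = to_s
  chars.length > options[:length] ? chars[0...len] + options[:omission] : chars
end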
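
The length arithmetic is worth a worked example: with the defaults, a 40-character string keeps 30 - 3 = 27 characters and then gets the three-character omission, for a total of exactly 30. The sketch below uses a hypothetical standalone helper, `shorten`, with positional parameters instead of an options hash.

```ruby
# Hypothetical helper mirroring the arithmetic of truncate.
def shorten(text, length = 30, omission = '...')
  return text if text.length <= length
  keep = length - omission.length   # characters kept from the original
  text[0...keep] + omission
end

long  = shorten('a' * 40)
short = shorten('hello')
puts long.length
puts short
```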

#urlize(options = {}) ⇒ Object

Convert a string to a format suitable for a URL without ever using escaped characters. It calls strip and removeaccents, optionally downcases the result, optionally converts spaces to underscores, and finally removes every character matching the regexp (by default /[^-_A-Za-z0-9]/, which also removes any remaining spaces).

Options

  • :downcase => call downcase on the string (defaults to true)

  • :convert_spaces => Convert space to underscore (defaults to false)

  • :regexp => The regexp matching the characters to remove (defaults to /[^-_A-Za-z0-9]/)



# File 'lib/extensions/remove_accents.rb', line 75

def urlize(options = {})
  # fetch keeps an explicit :downcase => false; "||= true" would overwrite it.
  downcase = options.fetch(:downcase, true)
  regexp = options[:regexp] || /[^-_A-Za-z0-9]/

  str = strip.removeaccents
  str.downcase! if downcase
  str.gsub!(/ /, '_') if options[:convert_spaces]
  str.gsub(regexp, '')
end
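
One detail about boolean options that default to true is worth illustrating: `||=` cannot tell an explicit false from a missing key, so it silently re-enables the option. The sketch below contrasts it with `Hash#fetch`. Both method names are hypothetical and exist only for this comparison.

```ruby
# Applies the default with ||=, which treats false the same as "not given".
def downcase_with_or_equals(options = {})
  options[:downcase] ||= true
  options[:downcase]
end

# Applies the default with fetch, which only falls back when the key is absent.
def downcase_with_fetch(options = {})
  options.fetch(:downcase, true)
end

overwritten = downcase_with_or_equals(:downcase => false)
respected   = downcase_with_fetch(:downcase => false)
puts overwritten
puts respected
```

The same concern does not arise for options that default to false or for non-boolean options such as :regexp.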