Class: String
- Defined in:
- lib/rbot/irc.rb,
 lib/rbot/botuser.rb,
 lib/rbot/ircsocket.rb,
 lib/rbot/core/utils/extends.rb
Overview
Extensions to the String class
TODO make riphtml() just call ircify_html() with stronger purify options.
Instance Method Summary
- #get_html_title ⇒ Object
  This method tries to find an HTML title in the string, and returns it if found.
- #has_irc_glob? ⇒ Boolean
  This method checks if the receiver contains IRC glob characters.
- #irc_downcase(casemap = 'rfc1459') ⇒ Object
  This method returns a string which is the downcased version of the receiver, according to the given casemap.
- #irc_downcase!(casemap = 'rfc1459') ⇒ Object
  Same as irc_downcase, except that the string is altered in place.
- #irc_send_penalty ⇒ Object
  Calculate the penalty which will be assigned to this message by the IRCd.
- #irc_upcase(casemap = 'rfc1459') ⇒ Object
  Returns the upcased version of the receiver, according to the given casemap.
- #irc_upcase!(casemap = 'rfc1459') ⇒ Object
  In-place upcasing.
- #ircify_html(opts = {}) ⇒ Object
  This method will return a purified version of the receiver, with all HTML stripped off and some of it converted to IRC formatting.
- #ircify_html!(opts = {}) ⇒ Object
  Same as ircify_html, but modifies the receiver.
- #ircify_html_title ⇒ Object
  This method returns the IRC-formatted version of an HTML title found in the string.
- #riphtml ⇒ Object
  This method will strip all HTML crud from the receiver.
- #to_irc_auth_command ⇒ Object
  Returns an Irc::Bot::Auth::Command from the receiver.
- #to_irc_casemap ⇒ Object
  This method returns the Irc::Casemap whose name is the receiver.
- #to_irc_channel(opts = {}) ⇒ Object
  Converts the receiver into an Irc::Channel object.
- #to_irc_channel_topic ⇒ Object
  Returns an Irc::Channel::Topic with self as text.
- #to_irc_netmask(opts = {}) ⇒ Object
  Converts the receiver into an Irc::Netmask object.
- #to_irc_regexp ⇒ Object
  This method is used to convert the receiver into a Regular Expression that matches according to the IRC glob syntax.
- #to_irc_user(opts = {}) ⇒ Object
  Converts the receiver into an Irc::User object.
- #wrap_nonempty(pre, post, opts = {}) ⇒ Object
  This method is used to wrap a nonempty String by adding the prefix and postfix.
Instance Method Details
#get_html_title ⇒ Object
This method tries to find an HTML title in the string, and returns it if found
    # File 'lib/rbot/core/utils/extends.rb', line 338
    def get_html_title
      if defined? ::Hpricot
        Hpricot(self).at("title").inner_html
      else
        return unless Irc::Utils::TITLE_REGEX.match(self)
        $1
      end
    end
#has_irc_glob? ⇒ Boolean
This method checks if the receiver contains IRC glob characters
IRC has a very primitive concept of globs: a * stands for “any number of arbitrary characters”, a ? stands for “one and exactly one arbitrary character”. These characters can be escaped by prefixing them with a backslash (\).
A known limitation of this glob syntax is that there is no way to escape the escape character itself, so it’s not possible to build a glob pattern where the escape character precedes a glob.
    # File 'lib/rbot/irc.rb', line 332
    def has_irc_glob?
      self =~ /^[*?]|[^\\][*?]/
    end
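To make the escaping rule concrete, here is a minimal standalone sketch that applies the same pattern to plain strings. The `irc_glob?` helper and the `IRC_GLOB_PATTERN` constant are hypothetical names used only for illustration; they are not part of rbot.

```ruby
# Hypothetical standalone helper mirroring the pattern used by has_irc_glob?.
IRC_GLOB_PATTERN = /^[*?]|[^\\][*?]/

def irc_glob?(str)
  # True when a * or ? appears at the start of the string or after any
  # character other than a backslash.
  !!(str =~ IRC_GLOB_PATTERN)
end

puts irc_glob?("nick*")    # true: unescaped *
puts irc_glob?('nick\*')   # false: the * is escaped with a backslash
puts irc_glob?("?nick")    # true: glob at the start of the string
puts irc_glob?("nick")     # false: no glob characters at all
```

Note that the single-quoted literal `'nick\*'` contains a real backslash followed by `*`, which is the escaped form described above.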
#irc_downcase(casemap = 'rfc1459') ⇒ Object
This method returns a string which is the downcased version of the receiver, according to the given casemap
    # File 'lib/rbot/irc.rb', line 289
    def irc_downcase(casemap='rfc1459')
      cmap = casemap.to_irc_casemap
      self.tr(cmap.upper, cmap.lower)
    end
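The effect of a casemap is easiest to see on a nick that contains brackets. The sketch below assumes that the rfc1459 casemap folds `A-Z` plus `[ \ ] ^` onto `a-z` plus `{ | } ~`, as in the RFC 1459 casemapping convention; the character ranges and the `rfc1459_downcase` helper are illustrative assumptions, since the actual Irc::Casemap definitions are not shown on this page.

```ruby
# Assumed character ranges for the rfc1459 casemap: A-Z plus [ \ ] ^
# fold to a-z plus { | } ~. The real values live in Irc::Casemap.
RFC1459_UPPER = "\x41-\x5e"
RFC1459_LOWER = "\x61-\x7e"

# Hypothetical helper doing what irc_downcase does with that casemap.
def rfc1459_downcase(str)
  str.tr(RFC1459_UPPER, RFC1459_LOWER)
end

puts rfc1459_downcase("Nick[Away]")  # => nick{away}
puts "Nick[Away]".downcase           # => nick[away]
```

Under this assumption, two nicks that differ only in `[` versus `{` compare as equal after downcasing, which a plain `String#downcase` would not give you.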
#irc_downcase!(casemap = 'rfc1459') ⇒ Object
Same as irc_downcase, except that the string is altered in place. Like String#tr!, it returns nil when no characters were changed.
See also the discussion about irc_downcase
    # File 'lib/rbot/irc.rb', line 298
    def irc_downcase!(casemap='rfc1459')
      cmap = casemap.to_irc_casemap
      self.tr!(cmap.upper, cmap.lower)
    end
#irc_send_penalty ⇒ Object
Calculate the penalty which will be assigned to this message by the IRCd
    # File 'lib/rbot/ircsocket.rb', line 14
    def irc_send_penalty
      # According to eggdrop, the initial penalty is
      penalty = 1 + self.size/100
      # on everything but UnderNET where it's
      # penalty = 2 + self.size/120

      cmd, pars = self.split($;,2)
      debug "cmd: #{cmd}, pars: #{pars.inspect}"
      case cmd.to_sym
      when :KICK
        chan, nick, msg = pars.split
        chan = chan.split(',')
        nick = nick.split(',')
        penalty += nick.size
        penalty *= chan.size
      when :MODE
        chan, modes, argument = pars.split
        extra = 0
        if modes
          extra = 1
          if argument
            extra += modes.split(/\+|-/).size
          else
            extra += 3 * modes.split(/\+|-/).size
          end
        end
        if argument
          extra += 2 * argument.split.size
        end
        penalty += extra * chan.split.size
      when :TOPIC
        penalty += 1
        penalty += 2 unless pars.split.size < 2
      when :PRIVMSG, :NOTICE
        dests = pars.split($;,2).first
        penalty += dests.split(',').size
      when :WHO
        args = pars.split
        if args.length > 0
          penalty += args.inject(0){ |sum,x| sum += ((x.length > 4) ? 3 : 5) }
        else
          penalty += 10
        end
      when :PART
        penalty += 4
      when :AWAY, :JOIN, :VERSION, :TIME, :TRACE, :WHOIS, :DNS
        penalty += 2
      when :INVITE, :NICK
        penalty += 3
      when :ISON
        penalty += 1
      else # Unknown messages
        penalty += 1
      end
      if penalty > 99
        debug "Wow, more than 99 secs of penalty!"
        penalty = 99
      end
      if penalty < 2
        debug "Wow, less than 2 secs of penalty!"
        penalty = 2
      end
      debug "penalty: #{penalty}"
      return penalty
    end
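As a worked example of the arithmetic, the following simplified sketch reproduces only the PRIVMSG/NOTICE branch and the fallback branch of the calculation. The `sketch_send_penalty` function is a hypothetical stand-in, not rbot code, and it splits on a literal space rather than on `$;`.

```ruby
# Simplified sketch: only the PRIVMSG/NOTICE branch and the fallback branch
# of the penalty calculation are reproduced here.
def sketch_send_penalty(line)
  penalty = 1 + line.size / 100          # base penalty grows with message length
  cmd, pars = line.split(' ', 2)
  case cmd.to_sym
  when :PRIVMSG, :NOTICE
    dests = pars.split(' ', 2).first     # comma-separated list of targets
    penalty += dests.split(',').size     # one extra point per target
  else
    penalty += 1                         # unknown messages
  end
  penalty.clamp(2, 99)                   # same bounds as the original method
end

puts sketch_send_penalty("PRIVMSG #a,#b :hello")  # => 3
puts sketch_send_penalty("PING irc.example.net")  # => 2
```

The first message is 20 characters long, so the base penalty is 1 + 20/100 = 1 (integer division), and the two targets add 2 more.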
#irc_upcase(casemap = 'rfc1459') ⇒ Object
Returns the upcased version of the receiver, according to the given casemap
See also the discussion about irc_downcase
    # File 'lib/rbot/irc.rb', line 307
    def irc_upcase(casemap='rfc1459')
      cmap = casemap.to_irc_casemap
      self.tr(cmap.lower, cmap.upper)
    end
#irc_upcase!(casemap = 'rfc1459') ⇒ Object
In-place upcasing
See also the discussion about irc_downcase
    # File 'lib/rbot/irc.rb', line 316
    def irc_upcase!(casemap='rfc1459')
      cmap = casemap.to_irc_casemap
      self.tr!(cmap.lower, cmap.upper)
    end
#ircify_html(opts = {}) ⇒ Object
This method will return a purified version of the receiver, with all HTML stripped off and some of it converted to IRC formatting
    # File 'lib/rbot/core/utils/extends.rb', line 214
    def ircify_html(opts={})
      txt = self.dup

      # remove scripts
      txt.gsub!(/<script(?:\s+[^>]*)?>.*?<\/script>/im, "")

      # remove styles
      txt.gsub!(/<style(?:\s+[^>]*)?>.*?<\/style>/im, "")

      # bold and strong -> bold
      txt.gsub!(/<\/?(?:b|strong)(?:\s+[^>]*)?>/im, "#{Bold}")

      # italic, emphasis and underline -> underline
      txt.gsub!(/<\/?(?:i|em|u)(?:\s+[^>]*)?>/im, "#{Underline}")

      ## This would be a nice addition, but the results are horrible
      ## Maybe make it configurable?
      # txt.gsub!(/<\/?a( [^>]*)?>/, "#{Reverse}")
      case val = opts[:a_href]
      when Reverse, Bold, Underline
        txt.gsub!(/<(?:\/a\s*|a (?:[^>]*\s+)?href\s*=\s*(?:[^>]*\s*)?)>/, val)
      when :link_out
        # Not good for nested links, but the best we can do without something like hpricot
        txt.gsub!(/<a (?:[^>]*\s+)?href\s*=\s*(?:([^"'>][^\s>]*)\s+|"((?:[^"]|\\")*)"|'((?:[^']|\\')*)')(?:[^>]*\s+)?>(.*?)<\/a>/) { |match|
          debug match
          debug [$1, $2, $3, $4].inspect
          link = $1 || $2 || $3
          str = $4
          str + ": " + link
        }
      else
        warning "unknown :a_href option #{val} passed to ircify_html" if val
      end

      # If opts[:img] is defined, it should be a String. Each image
      # will be replaced by the string itself, replacing occurrences of
      # %{alt} %{dimensions} and %{src} with the alt text, image dimensions
      # and URL
      if val = opts[:img]
        if val.kind_of? String
          txt.gsub!(/<img\s+(.*?)\s*\/?>/) do |imgtag|
            attrs = Hash.new
            imgtag.scan(/([[:alpha:]]+)\s*=\s*(['"])?(.*?)\2/) do |key, quote, value|
              k = key.downcase.intern rescue 'junk'
              attrs[k] = value
            end
            attrs[:alt] ||= attrs[:title]
            attrs[:width] ||= '...'
            attrs[:height] ||= '...'
            attrs[:dimensions] ||= "#{attrs[:width]}x#{attrs[:height]}"
            val % attrs
          end
        else
          warning ":img option is not a string"
        end
      end

      # Paragraph and br tags are converted to whitespace
      txt.gsub!(/<\/?(p|br)(?:\s+[^>]*)?\s*\/?\s*>/i, ' ')
      txt.gsub!("\n", ' ')
      txt.gsub!("\r", ' ')

      # Superscripts and subscripts are turned into ^{...} and _{...}
      # where the {} are omitted for single characters
      txt.gsub!(/<sup>(.*?)<\/sup>/, '^{\1}')
      txt.gsub!(/<sub>(.*?)<\/sub>/, '_{\1}')
      txt.gsub!(/(^|_)\{(.)\}/, '\1\2')

      # List items are converted to *). We don't have special support for
      # nested or ordered lists.
      txt.gsub!(/<li>/, ' *) ')

      # All other tags are just removed
      txt.gsub!(/<[^>]+>/, '')

      # Convert HTML entities. We do it now to be able to handle stuff
      # such as &nbsp;
      txt = Utils.decode_html_entities(txt)

      # Keep unbreakable spaces or conver them to plain spaces?
      case val = opts[:nbsp]
      when :space, ' '
        txt.gsub!([160].pack('U'), ' ')
      else
        warning "unknown :nbsp option #{val} passed to ircify_html" if val
      end

      # Remove double formatting options, since they only waste bytes
      txt.gsub!(/#{Bold}(\s*)#{Bold}/, '\1')
      txt.gsub!(/#{Underline}(\s*)#{Underline}/, '\1')

      # Simplify whitespace that appears on both sides of a formatting option
      txt.gsub!(/\s+(#{Bold}|#{Underline})\s+/, ' \1')
      txt.sub!(/\s+(#{Bold}|#{Underline})\z/, '\1')
      txt.sub!(/\A(#{Bold}|#{Underline})\s+/, '\1')

      # And finally whitespace is squeezed
      txt.gsub!(/\s+/, ' ')
      txt.strip!

      if opts[:limit] && txt.size > opts[:limit]
        txt = txt.slice(0, opts[:limit]) + "#{Reverse}...#{Reverse}"
      end

      # Decode entities and strip whitespace
      return txt
    end
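The core of the conversion can be tried outside rbot with a much smaller sketch. It assumes the common IRC control codes `\x02` (bold) and `\x1F` (underline), because rbot's own Bold and Underline constants are not shown on this page, and the `sketch_ircify` helper is hypothetical. Links, images, entities, lists and the `:limit` option are left out.

```ruby
# Assumed control codes: \x02 (bold) and \x1F (underline) are the usual IRC
# formatting characters; rbot's own Bold/Underline constants are not shown here.
BOLD = "\x02"
UNDERLINE = "\x1F"

# Hypothetical, reduced version of the tag handling in ircify_html.
def sketch_ircify(html)
  txt = html.dup
  txt.gsub!(/<\/?(?:b|strong)(?:\s+[^>]*)?>/im, BOLD)      # bold tags -> bold toggle
  txt.gsub!(/<\/?(?:i|em|u)(?:\s+[^>]*)?>/im, UNDERLINE)    # italic/underline -> underline toggle
  txt.gsub!(/<\/?(p|br)(?:\s+[^>]*)?\s*\/?\s*>/i, ' ')      # paragraph and br tags -> space
  txt.gsub!(/<[^>]+>/, '')                                  # drop every other tag
  txt.gsub(/\s+/, ' ').strip                                # squeeze and trim whitespace
end

p sketch_ircify("<p>Hello <b>big</b>\n<em>world</em></p>")
```

Because opening and closing tags both map to the same control character, the formatting code works as a toggle around the enclosed text.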
#ircify_html!(opts = {}) ⇒ Object
Same as ircify_html, but modifies the receiver. Returns the receiver if it was changed, nil otherwise
    # File 'lib/rbot/core/utils/extends.rb', line 324
    def ircify_html!(opts={})
      old_hash = self.hash
      replace self.ircify_html(opts)
      return self unless self.hash == old_hash
    end
#ircify_html_title ⇒ Object
This method returns the IRC-formatted version of an HTML title found in the string
    # File 'lib/rbot/core/utils/extends.rb', line 349
    def ircify_html_title
      self.get_html_title.ircify_html rescue nil
    end
#riphtml ⇒ Object
This method will strip all HTML crud from the receiver
    # File 'lib/rbot/core/utils/extends.rb', line 332
    def riphtml
      self.gsub(/<[^>]+>/, '').gsub(/&amp;/,'&').gsub(/&quot;/,'"').gsub(/&lt;/,'<').gsub(/&gt;/,'>').gsub(/&ellip;/,'...').gsub(/&apos;/, "'").gsub("\n",'')
    end
#to_irc_auth_command ⇒ Object
Returns an Irc::Bot::Auth::Command from the receiver
    # File 'lib/rbot/botuser.rb', line 119
    def to_irc_auth_command
      Irc::Bot::Auth::Command.new(self)
    end
#to_irc_casemap ⇒ Object
This method returns the Irc::Casemap whose name is the receiver. If no casemap with that name can be retrieved, an error is logged and the rfc1459 casemap is returned instead
    # File 'lib/rbot/irc.rb', line 275
    def to_irc_casemap
      begin
        Irc::Casemap.get(self)
      rescue
        # raise TypeError, "Unkown Irc::Casemap #{self.inspect}"
        error "Unkown Irc::Casemap #{self.inspect} requested, defaulting to rfc1459"
        Irc::Casemap.get('rfc1459')
      end
    end
#to_irc_channel(opts = {}) ⇒ Object
Converts the receiver into an Irc::Channel object
    # File 'lib/rbot/irc.rb', line 1513
    def to_irc_channel(opts={})
      Irc::Channel.new(self, opts)
    end
#to_irc_channel_topic ⇒ Object
Returns an Irc::Channel::Topic with self as text
    # File 'lib/rbot/irc.rb', line 1318
    def to_irc_channel_topic
      Irc::Channel::Topic.new(self)
    end
#to_irc_netmask(opts = {}) ⇒ Object
Converts the receiver into an Irc::Netmask object
    # File 'lib/rbot/irc.rb', line 915
    def to_irc_netmask(opts={})
      Irc::Netmask.new(self, opts)
    end
#to_irc_regexp ⇒ Object
This method is used to convert the receiver into a Regular Expression that matches according to the IRC glob syntax
    # File 'lib/rbot/irc.rb', line 339
    def to_irc_regexp
      regmask = Regexp.escape(self)
      regmask.gsub!(/(\\\\)?\\[*?]/) { |m|
        case m
        when /\\(\\[*?])/
          $1
        when /\\\*/
          '.*'
        when /\\\?/
          '.'
        else
          raise "Unexpected match #{m} when converting #{self}"
        end
      }
      Regexp.new("^#{regmask}$")
    end
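The conversion can be exercised without rbot using a free-function sketch of the same steps: escape the whole string, then turn the escaped glob characters back into regular-expression wildcards. The `glob_to_regexp` name and the sample hostmask are hypothetical.

```ruby
# Hypothetical standalone version of the conversion performed by to_irc_regexp.
def glob_to_regexp(glob)
  escaped = Regexp.escape(glob)
  # After escaping, a glob character appears as \* or \?, while a glob
  # character that was already escaped in the input appears as \\\* or \\\?.
  pattern = escaped.gsub(/(\\\\)?\\[*?]/) do |m|
    case m
    when /\\(\\[*?])/ then $1    # escaped glob: keep it as a literal * or ?
    when /\\\*/       then '.*'  # * matches any run of characters
    when /\\\?/       then '.'   # ? matches exactly one character
    end
  end
  Regexp.new("^#{pattern}$")
end

mask = glob_to_regexp("nick*!*@*.example.com")
puts mask.match?("nick_away!user@host.example.com")  # true
puts mask.match?("other!user@host.example.com")      # false
```

An input such as `'a\*b'` (with a real backslash) yields a pattern that matches only the literal string `a*b`.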
#to_irc_user(opts = {}) ⇒ Object
Converts the receiver into an Irc::User object
    # File 'lib/rbot/irc.rb', line 1108
    def to_irc_user(opts={})
      Irc::User.new(self, opts)
    end
#wrap_nonempty(pre, post, opts = {}) ⇒ Object
This method is used to wrap a nonempty String by adding the prefix and postfix
    # File 'lib/rbot/core/utils/extends.rb', line 355
    def wrap_nonempty(pre, post, opts={})
      if self.empty?
        String.new
      else
        "#{pre}#{self}#{post}"
      end
    end
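A typical use is building a message with an optional part, where the surrounding punctuation should disappear together with the empty value. The sketch below uses a hypothetical free function with the same logic, so it runs without rbot; the sample strings are made up.

```ruby
# Hypothetical free-function equivalent of String#wrap_nonempty.
def wrap_nonempty(str, pre, post)
  str.empty? ? String.new : "#{pre}#{str}#{post}"
end

puts "Channel info#{wrap_nonempty('release day', ' (', ')')}"  # => Channel info (release day)
puts "Channel info#{wrap_nonempty('', ' (', ')')}"             # => Channel info
```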