Module: Regexp::Irc

Defined in:
lib/rbot/irc.rb,
lib/rbot/core/utils/extends.rb

Overview

We start with some IRC-related regular expressions, used to match Irc::User nicks and users, and Irc::Channel names.

For each of them we define two versions of the regular expression:

  • a generic one, which should match for any server but may turn out to match more than a specific server would accept

  • an RFC-compliant matcher

Constant Summary

CHAN_FIRST =

Channel-name-matching regexps

/[#&+]/
CHAN_SAFE =
/![A-Z0-9]{5}/
CHAN_ANY =
/[^\x00\x07\x0A\x0D ,:]/
GEN_CHAN =
/(?:#{CHAN_FIRST}|#{CHAN_SAFE})#{CHAN_ANY}+/
RFC_CHAN =
/#{CHAN_FIRST}#{CHAN_ANY}{1,49}|#{CHAN_SAFE}#{CHAN_ANY}{1,44}/
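
The difference between the generic and the RFC-compliant channel matchers can be sketched as follows. This is only an illustration: the constants are copied locally from this page so the snippet runs without rbot, and the module name ChanSketch and the helper whole_match? are hypothetical names introduced here, not part of the library.

```ruby
# Hypothetical sandbox module; the patterns are local copies of the ones above.
module ChanSketch
  CHAN_FIRST = /[#&+]/
  CHAN_SAFE  = /![A-Z0-9]{5}/
  CHAN_ANY   = /[^\x00\x07\x0A\x0D ,:]/
  GEN_CHAN   = /(?:#{CHAN_FIRST}|#{CHAN_SAFE})#{CHAN_ANY}+/
  RFC_CHAN   = /#{CHAN_FIRST}#{CHAN_ANY}{1,49}|#{CHAN_SAFE}#{CHAN_ANY}{1,44}/

  # The documented patterns are unanchored, so anchor them to test a whole
  # string. Interpolating a Regexp wraps it in a group, which keeps the
  # alternation in RFC_CHAN inside the anchors.
  def self.whole_match?(pattern, text)
    /\A#{pattern}\z/.match?(text)
  end
end

long_name = "#" + "a" * 60   # 61 characters, longer than the RFC limit of 50

puts ChanSketch.whole_match?(ChanSketch::GEN_CHAN, "#rbot")     # => true
puts ChanSketch.whole_match?(ChanSketch::RFC_CHAN, "#rbot")     # => true
puts ChanSketch.whole_match?(ChanSketch::GEN_CHAN, long_name)   # => true
puts ChanSketch.whole_match?(ChanSketch::RFC_CHAN, long_name)   # => false
```

The generic matcher puts no upper bound on the name length, which is why it accepts the over-long name that the RFC matcher rejects.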
SPECIAL_CHAR =

Nick-matching regexps

/[\x5b-\x60\x7b-\x7d]/
NICK_FIRST =
/#{SPECIAL_CHAR}|[[:alpha:]]/
NICK_ANY =
/#{SPECIAL_CHAR}|[[:alnum:]]|-/
GEN_NICK =
/#{NICK_FIRST}#{NICK_ANY}+/
RFC_NICK =
/#{NICK_FIRST}#{NICK_ANY}{0,8}/
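
A small sketch of how the two nick matchers differ, assuming the patterns exactly as listed above. The constants are copied locally so the snippet runs without rbot; NickSketch and whole_match? are hypothetical names used only for this illustration.

```ruby
# Hypothetical sandbox module; the patterns are local copies of the ones above.
module NickSketch
  SPECIAL_CHAR = /[\x5b-\x60\x7b-\x7d]/
  NICK_FIRST   = /#{SPECIAL_CHAR}|[[:alpha:]]/
  NICK_ANY     = /#{SPECIAL_CHAR}|[[:alnum:]]|-/
  GEN_NICK     = /#{NICK_FIRST}#{NICK_ANY}+/
  RFC_NICK     = /#{NICK_FIRST}#{NICK_ANY}{0,8}/

  # Anchor the unanchored pattern so the whole string has to match.
  def self.whole_match?(pattern, text)
    /\A#{pattern}\z/.match?(text)
  end
end

puts NickSketch.whole_match?(NickSketch::GEN_NICK, "a_very_long_nickname")  # => true
puts NickSketch.whole_match?(NickSketch::RFC_NICK, "a_very_long_nickname")  # => false
puts NickSketch.whole_match?(NickSketch::GEN_NICK, "x")                     # => false
puts NickSketch.whole_match?(NickSketch::RFC_NICK, "x")                     # => true
```

As written, GEN_NICK needs at least two characters (one NICK_FIRST followed by one or more NICK_ANY), while RFC_NICK accepts one to nine characters.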
USER_CHAR =
/[^\x00\x0a\x0d @]/
GEN_USER =
/#{USER_CHAR}+/
HOSTNAME_COMPONENT =

Host-matching regexps

/[[:alnum:]](?:[[:alnum:]]|-)*[[:alnum:]]*/
HOSTNAME =
/#{HOSTNAME_COMPONENT}(?:\.#{HOSTNAME_COMPONENT})*/
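
The hostname patterns can be tried out in isolation with a sketch like the one below. The two constants are copied locally from this page; HostSketch and hostname? are hypothetical names, and the host names are made-up examples.

```ruby
# Hypothetical sandbox module; the patterns are local copies of the ones above.
module HostSketch
  HOSTNAME_COMPONENT = /[[:alnum:]](?:[[:alnum:]]|-)*[[:alnum:]]*/
  HOSTNAME           = /#{HOSTNAME_COMPONENT}(?:\.#{HOSTNAME_COMPONENT})*/

  # True when the whole string is a dot-separated list of components.
  def self.hostname?(text)
    /\A#{HOSTNAME}\z/.match?(text)
  end
end

puts HostSketch.hostname?("irc.example.org")       # => true
puts HostSketch.hostname?("under_score.example")   # => false
puts HostSketch.hostname?("-leading.example")      # => false
```

Note that, as written, the component pattern also accepts a trailing hyphen, because the middle group can consume it before the final optional alphanumerics.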
HOSTADDR =
/#{IP_ADDR}|#{IP6_ADDR}/
GEN_HOST =
/#{HOSTNAME}|#{HOSTADDR}/
GEN_HOST_EXT =

Sadly, different networks have different, RFC-breaking ways of cloaking the actual host address: see above for an example to handle FreeNode. Another example would be Azzurra, which also inserts a “=” in the cloaked host. So let’s just not care about this and go with the simplest thing:

/\S+/
GEN_USER_ID =

User-matching Regexp

/(#{GEN_NICK})(?:(?:!(#{GEN_USER}))?@(#{GEN_HOST_EXT}))?/
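
The three captures of GEN_USER_ID (nick, optional user, optional host) can be illustrated with the sketch below. The constants are copied locally from this page; UserIdSketch and split are hypothetical names, and the masks are made-up examples.

```ruby
# Hypothetical sandbox module; the patterns are local copies of the ones above.
module UserIdSketch
  SPECIAL_CHAR = /[\x5b-\x60\x7b-\x7d]/
  NICK_FIRST   = /#{SPECIAL_CHAR}|[[:alpha:]]/
  NICK_ANY     = /#{SPECIAL_CHAR}|[[:alnum:]]|-/
  GEN_NICK     = /#{NICK_FIRST}#{NICK_ANY}+/
  USER_CHAR    = /[^\x00\x0a\x0d @]/
  GEN_USER     = /#{USER_CHAR}+/
  GEN_HOST_EXT = /\S+/
  GEN_USER_ID  = /(#{GEN_NICK})(?:(?:!(#{GEN_USER}))?@(#{GEN_HOST_EXT}))?/

  # Split a full mask into its parts; returns nil when the string as a whole
  # does not match. Missing optional parts come back as nil.
  def self.split(mask)
    m = /\A#{GEN_USER_ID}\z/.match(mask)
    m && { nick: m[1], user: m[2], host: m[3] }
  end
end

p UserIdSketch.split("rbot!~bot@example.org")
p UserIdSketch.split("rbot@unaffiliated/rbot")
p UserIdSketch.split("rbot")
```

Because the host part is just /\S+/, a cloaked host containing characters such as “/” or “=” is captured unchanged.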
BANG_AT =

Things such as the BIP proxy send invalid nicks in a complete netmask, so we want to match this instead: this matches either a compliant nick or a string with a very generic nick, a very generic hostname after an @ sign, and an optional user after a !

/#{GEN_NICK}|\S+?(?:!\S+?)?@\S+?/
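
A brief sketch of what BANG_AT accepts that GEN_NICK alone would not. The constants are copied locally from this page; BangAtSketch and whole_match? are hypothetical names, and the sample strings are invented for illustration.

```ruby
# Hypothetical sandbox module; the patterns are local copies of the ones above.
module BangAtSketch
  SPECIAL_CHAR = /[\x5b-\x60\x7b-\x7d]/
  NICK_FIRST   = /#{SPECIAL_CHAR}|[[:alpha:]]/
  NICK_ANY     = /#{SPECIAL_CHAR}|[[:alnum:]]|-/
  GEN_NICK     = /#{NICK_FIRST}#{NICK_ANY}+/
  BANG_AT      = /#{GEN_NICK}|\S+?(?:!\S+?)?@\S+?/

  # Anchor the unanchored pattern so the whole string has to match.
  def self.whole_match?(pattern, text)
    /\A#{pattern}\z/.match?(text)
  end
end

odd_mask = "1bad!user@host"   # the nick part starts with a digit

puts BangAtSketch.whole_match?(BangAtSketch::GEN_NICK, odd_mask)  # => false
puts BangAtSketch.whole_match?(BangAtSketch::BANG_AT, odd_mask)   # => true
puts BangAtSketch.whole_match?(BangAtSketch::BANG_AT, "rbot")     # => true
```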
CHAN_LIST =

Match a list of channel names separated by optional commas, whitespace and optionally the word “and”

Regexp.new_list(GEN_CHAN)
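
Regexp.new_list is an rbot extension whose definition is not shown on this page. The sketch below is therefore only an approximation of the behaviour described above, built on the assumption that items are separated by an optional comma, an optional “and”, and whitespace. ListSketch, list_of and whole_match? are hypothetical names, and the real helper may differ.

```ruby
# Hypothetical sandbox module; NOT the rbot implementation of Regexp.new_list.
module ListSketch
  CHAN_FIRST = /[#&+]/
  CHAN_SAFE  = /![A-Z0-9]{5}/
  CHAN_ANY   = /[^\x00\x07\x0A\x0D ,:]/
  GEN_CHAN   = /(?:#{CHAN_FIRST}|#{CHAN_SAFE})#{CHAN_ANY}+/

  # Assumed list shape: item, then any number of
  # [optional comma][optional " and"] whitespace item.
  def self.list_of(item)
    /#{item}(?:,?(?:\s+and)?\s+#{item})*/
  end

  CHAN_LIST = list_of(GEN_CHAN)

  def self.whole_match?(pattern, text)
    /\A#{pattern}\z/.match?(text)
  end
end

text = "#rbot, #ruby and #irc"

puts ListSketch.whole_match?(ListSketch::CHAN_LIST, text)  # => true
p text.scan(ListSketch::GEN_CHAN)                          # => ["#rbot", "#ruby", "#irc"]
```

Once a list has matched, scanning it with the item pattern is one simple way to pull out the individual names.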
IN_CHAN =

Match “in #channel” or “on #channel”, the word “here”, and (in the IN_CHAN_PVT variant) “in private” (optionally shortened to “in pvt”), returning the channel name or the word “here”, “private” or “pvt” as a capture

/#{IN_ON}\s+(#{GEN_CHAN})|(here)|/
IN_CHAN_PVT =
/#{IN_CHAN}|in\s+(private|pvt)/
IN_CHAN_LIST_SFX =

As above, but with channel lists

Regexp.new_list(/#{GEN_CHAN}|here/, IN_ON)
IN_CHAN_LIST =
/#{IN_ON}\s+#{IN_CHAN_LIST_SFX}|anywhere|everywhere/
IN_CHAN_LIST_PVT_SFX =
Regexp.new_list(/#{GEN_CHAN}|here|private|pvt/, IN_ON)
IN_CHAN_LIST_PVT =
/#{IN_ON}\s+#{IN_CHAN_LIST_PVT_SFX}|anywhere|everywhere/
NICK_LIST =

Match a list of nicknames separated by optional commas, whitespace and optionally the word “and”

Regexp.new_list(GEN_NICK)