Module: BELParser::Quoting

Overview

The Quoting module implements quoting rules consistent with BEL and BEL Script. Double quotes group a string that may contain whitespace or special characters into a single value.

A value is either an identifier or a string value. An identifier contains only the characters [0-9A-Za-z_] and is not one of the reserved Keywords. A string value is required when the value contains at least one character matching [^0-9A-Za-z_], or when the value is itself one of the Keywords.

Uses:

BEL: BEL parameters must be an identifier or a string value.

BEL Script: BEL parameters, document property values, and annotation values must each be an identifier or a string value.

Constant Summary

Keywords =

Declares BEL Script keywords that cause problems with the OpenBEL Framework parser.

%w(SET DEFINE a g p r m).freeze
KeywordMatcher =

Regular expression that matches one of Keywords.

Regexp.compile(/^(#{Keywords.join('|')})$/)
NonWordMatcher =

Regular expression that matches on any non-word character.

Regexp.compile(/[^0-9a-zA-Z_]/)
StrictQuotedMatcher =

Regular expression that matches a value surrounded by unescaped double quotes.

Regexp.compile(/\A".*?(?<!\\)"\Z/m)
LenientQuotedMatcher =

Regular expression that matches a value surrounded by double quotes that may be escaped.

Regexp.compile(/\A".*?"\Z/m)
QuoteNotEscapedMatcher =

Regular expression that matches double quotes that are not escaped.

Regexp.compile(/(?<!\\)"/m)
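
The difference between the strict and lenient matchers is easiest to see on a value whose closing quote is preceded by a backslash. The following is a minimal sketch, assuming the patterns are exactly as listed above; `MatcherSketch` and its constant names are hypothetical and exist only so the example runs without the bel_parser gem.

```ruby
# Patterns copied from the constants above; MatcherSketch is a hypothetical
# container used only for this illustration.
module MatcherSketch
  StrictQuoted    = /\A".*?(?<!\\)"\Z/m
  LenientQuoted   = /\A".*?"\Z/m
  QuoteNotEscaped = /(?<!\\)"/m
end

plain    = '"apoptotic process"'    # closing quote is unescaped
dangling = '"apoptotic process\\"'  # closing quote follows a backslash

[plain, dangling].each do |sample|
  strict  = MatcherSketch::StrictQuoted.match?(sample)
  lenient = MatcherSketch::LenientQuoted.match?(sample)
  puts "#{sample} strict=#{strict} lenient=#{lenient}"
end

# Count only the double quotes that are not preceded by a backslash.
mixed = 'say "hi" and \\"bye\\"'
puts mixed.scan(MatcherSketch::QuoteNotEscaped).length
```

Both patterns accept the first sample. Only the lenient pattern accepts the second, because its final quote is escaped.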

Instance Method Details

#identifier_value?(value) ⇒ Boolean

Returns whether the value represents an identifier. An identifier consists only of word characters (i.e. [0-9A-Za-z_]) and is not one of Keywords.

Examples:

Returns true when representing an identifier.

identifier_value?("AKT1_HUMAN")
# => true

Returns false when not representing an identifier.

identifier_value?("apoptotic process")
# => false

Returns:

  • (Boolean)

    true if value is an identifier, false if value is not an identifier



# File 'lib/bel_parser/quoting.rb', line 149

def identifier_value?(value)
  string = value.to_s
  [NonWordMatcher, KeywordMatcher].none? do |matcher|
    matcher.match string
  end
end
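
Because the method also consults KeywordMatcher, a value made up entirely of word characters is still rejected when it equals a keyword. The sketch below illustrates this under the assumption that the constants are as listed above; `IdentifierSketch` is a hypothetical stand-in that repeats them so the example runs without the bel_parser gem.

```ruby
# IdentifierSketch is a hypothetical stand-in that repeats the constants and
# method shown above; it is not the library module itself.
module IdentifierSketch
  Keywords       = %w(SET DEFINE a g p r m).freeze
  KeywordMatcher = Regexp.compile(/^(#{Keywords.join('|')})$/)
  NonWordMatcher = Regexp.compile(/[^0-9a-zA-Z_]/)

  module_function

  # True only when no matcher fires: no non-word character and no keyword.
  def identifier_value?(value)
    string = value.to_s
    [NonWordMatcher, KeywordMatcher].none? { |matcher| matcher.match(string) }
  end
end

# Keywords are matched whole and case-sensitively, so "set", "p53", and
# "SETTING" remain identifiers while "SET" and "p" do not.
%w(AKT1_HUMAN SET p set p53 SETTING).each do |value|
  puts "#{value}: #{IdentifierSketch.identifier_value?(value)}"
end
```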

#quote(value) ⇒ String

Returns value surrounded by double quotes. Unescaped double quotes inside value are escaped with a backslash. This method is idempotent, so value is quoted only once regardless of how many times the method is called on it.

Examples:

Quoting a BEL parameter.

quote("apoptotic process")
# => "\"apoptotic process\""

Escaping quotes within a value.

quote("vesicle fusion with \"Golgi apparatus\"")
# => "\"vesicle fusion with \\\"Golgi apparatus\\\"\""

Returns:

  • (String)

    value surrounded by double quotes



# File 'lib/bel_parser/quoting.rb', line 52

def quote(value)
  string   = value.to_s
  unquoted = unquote(string)
  escaped  = unquoted.gsub(QuoteNotEscapedMatcher, '\\"')
  %("#{escaped}")
end
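
The idempotence comes from unquoting before escaping: an already quoted value is stripped first, and its escaped inner quotes are skipped by QuoteNotEscapedMatcher. A minimal sketch of that property follows, assuming the patterns listed in the constants; `QuoteSketch` is a hypothetical stand-in for the module, written so the example runs on its own.

```ruby
# QuoteSketch is a hypothetical stand-in that mirrors quote and unquote as
# listed on this page, with the patterns written inline.
module QuoteSketch
  module_function

  # Strips the surrounding quotes when the closing quote is unescaped.
  def unquote(value)
    string = value.to_s
    string =~ /\A".*?(?<!\\)"\Z/m ? string[1...-1] : string
  end

  # Unquotes first, then escapes only quotes not already escaped.
  def quote(value)
    unquoted = unquote(value.to_s)
    escaped  = unquoted.gsub(/(?<!\\)"/, '\\"')
    %("#{escaped}")
  end
end

['apoptotic process', 'vesicle fusion with "Golgi apparatus"'].each do |value|
  once  = QuoteSketch.quote(value)
  twice = QuoteSketch.quote(once)
  puts "#{once} stable=#{once == twice}"
end
```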

#quote_if_needed(value) ⇒ String

Returns value with quoting applied only if necessary. A value consisting of only word characters (i.e. [0-9A-Za-z_]) does not need quoting. A value containing at least one non-word character (i.e. [^0-9A-Za-z_]) requires quoting, as does a value that is one of Keywords.

Examples:

Quotes added when value includes spaces.

quote_if_needed("apoptotic process")
# => "\"apoptotic process\""

Quotes added when value includes double quote.

quote_if_needed("vesicle fusion with \"Golgi apparatus\"")
# => "\"vesicle fusion with \\\"Golgi apparatus\\\"\""

No quotes necessary for identifier.

quote_if_needed("AKT1_HUMAN")
# => "AKT1_HUMAN"

Returns:

  • (String)

    original value or quoted value



# File 'lib/bel_parser/quoting.rb', line 95

def quote_if_needed(value)
  if string_value?(value)
    quote(value)
  else
    value.to_s
  end
end
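
One place this method is useful is when assembling a BEL parameter of the form prefix:value. The sketch below is an illustration under assumptions, not library code: `ParameterSketch` re-creates the methods from this page with inline patterns, and `bel_parameter` is a hypothetical helper that does not exist in BELParser.

```ruby
# ParameterSketch is a hypothetical stand-in for the module on this page.
module ParameterSketch
  module_function

  # True for values with a non-word character or equal to a keyword.
  def string_value?(value)
    string = value.to_s
    string.match?(/[^0-9a-zA-Z_]/) || string.match?(/^(SET|DEFINE|a|g|p|r|m)$/)
  end

  def unquote(value)
    string = value.to_s
    string =~ /\A".*?(?<!\\)"\Z/m ? string[1...-1] : string
  end

  def quote(value)
    escaped = unquote(value).gsub(/(?<!\\)"/, '\\"')
    %("#{escaped}")
  end

  def quote_if_needed(value)
    string_value?(value) ? quote(value) : value.to_s
  end

  # Hypothetical helper: joins a namespace prefix and a value as prefix:value.
  def bel_parameter(prefix, value)
    "#{prefix}:#{quote_if_needed(value)}"
  end
end

puts ParameterSketch.bel_parameter('HGNC', 'AKT1')
puts ParameterSketch.bel_parameter('GO', 'apoptotic process')
puts ParameterSketch.quote_if_needed('p')   # keyword, so it is quoted
puts ParameterSketch.quote_if_needed(9606)  # non-string input goes through to_s
```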

#quoted?(value) ⇒ Boolean

Returns whether the value is surrounded by double quotes.

Examples:

Returns true when value is quoted.

quoted?("\"vesicle fusion with \"Golgi apparatus\"")
# => true

Returns false when value is not quoted.

quoted?("apoptotic process")
# => false

Returns:

  • (Boolean)

    true if value is quoted, false if value is not quoted



# File 'lib/bel_parser/quoting.rb', line 115

def quoted?(value)
  string = value.to_s
  (string =~ LenientQuotedMatcher) != nil
end

#string_value?(value) ⇒ Boolean

Returns whether the value represents a string value. A string value contains at least one non-word character (i.e. [^0-9A-Za-z_]) or is one of Keywords.

Examples:

Returns true when representing a string value.

string_value?("apoptotic process")
# => true

Returns false when not representing a string value.

string_value?("AKT1_HUMAN")
# => false

Returns:

  • (Boolean)

    true if value is a string value, false if value is not a string value



# File 'lib/bel_parser/quoting.rb', line 170

def string_value?(value)
  string = value.to_s
  [NonWordMatcher, KeywordMatcher].any? do |matcher|
    matcher.match string
  end
end

#unquote(value) ⇒ String

Returns value with surrounding double quotes removed. Escaped quotes inside the value are left as they are.

Examples:

Unquoting a BEL parameter.

unquote("\"apoptotic process\"")
# => "apoptotic process"

Escaped quotes are preserved.

unquote("\"vesicle fusion with \"Golgi apparatus\"\"")
# => "vesicle fusion with \"Golgi apparatus\""

Returns:

  • (String)

    value with surrounding double quotes removed



# File 'lib/bel_parser/quoting.rb', line 69

def unquote(value)
  string = value.to_s
  if string =~ StrictQuotedMatcher
    string[1...-1]
  else
    string
  end
end
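
Since unquote only strips the outer quotes, a round trip through quote and unquote returns the value with its inner quotes still backslash-escaped rather than the original text. The sketch below shows this, assuming the listed patterns; `UnquoteSketch` is a hypothetical stand-in for the module, and `unescape` is a hypothetical helper that is not part of BELParser.

```ruby
# UnquoteSketch is a hypothetical stand-in mirroring quote and unquote from
# this page, with the patterns written inline.
module UnquoteSketch
  module_function

  def unquote(value)
    string = value.to_s
    string =~ /\A".*?(?<!\\)"\Z/m ? string[1...-1] : string
  end

  def quote(value)
    escaped = unquote(value).gsub(/(?<!\\)"/, '\\"')
    %("#{escaped}")
  end

  # Hypothetical helper: turns each backslash-escaped quote back into a quote.
  def unescape(value)
    value.to_s.gsub('\\"', '"')
  end
end

original = 'vesicle fusion with "Golgi apparatus"'
stripped = UnquoteSketch.unquote(UnquoteSketch.quote(original))

puts stripped                                      # inner quotes stay escaped
puts UnquoteSketch.unescape(stripped) == original  # escaping can be reversed
```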

#unquoted?(value) ⇒ Boolean

Returns whether the value is not surrounded by double quotes.

Examples:

Returns true when value is not quoted.

unquoted?("apoptotic process")
# => true

Returns false when value is quoted.

unquoted?("\"vesicle fusion with \"Golgi apparatus\"")
# => false

Returns:

  • (Boolean)

    true if value is not quoted, false if value is quoted



# File 'lib/bel_parser/quoting.rb', line 132

def unquoted?(value)
  !quoted?(value)
end