Class: Gitlab::UntrustedRegexp

Inherits:
Object
Defined in:
lib/gitlab/untrusted_regexp.rb,
lib/gitlab/untrusted_regexp/ruby_syntax.rb

Overview

An untrusted regular expression is any regexp containing patterns sourced from user input.

Ruby’s built-in regular expression library allows patterns which complete in exponential time, permitting denial-of-service attacks.

Not all regular expression features are available in untrusted regexes, and there is a strict limit on total execution time. See the RE2 documentation at github.com/google/re2/wiki/Syntax for more details.

This class never changes its instance variables after initialization, so instances can be frozen and assigned to constants.

Defined Under Namespace

Classes: RubySyntax

Constant Summary

BACKSLASH_R =
'(\n|\v|\f|\r|\x{0085}|\x{2028}|\x{2029}|\r\n)'

Constructor Details

#initialize(pattern, multiline: false) ⇒ UntrustedRegexp

Returns a new instance of UntrustedRegexp.

Raises:

  • (RegexpError)


# File 'lib/gitlab/untrusted_regexp.rb', line 25

def initialize(pattern, multiline: false)
  if multiline
    pattern = "(?m)#{pattern}"
  end

  @regexp = RE2::Regexp.new(pattern, log_errors: false)
  @scan_regexp = initialize_scan_regexp

  raise RegexpError, regexp.error unless regexp.ok?
end

Class Method Details

.with_fallback(pattern, multiline: false) ⇒ Object

Handles regular expressions with the preferred RE2 library where possible via UntrustedRegexp. Falls back to Ruby’s built-in regular expression library when the syntax would be invalid in RE2.

One difference between the two engines is the `(?m)` multi-line mode. Ruby's regexp engine behaves as if this mode were enabled by default, but it also handles `^` and `$` differently. See: www.regular-expressions.info/modifiers.html
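
The Ruby side of this difference can be checked with the built-in engine alone. The following is a minimal sketch and not part of the GitLab source; the sample string and variable names are assumptions chosen for illustration.

```ruby
text = "first\nsecond"

# In Ruby, ^ and $ match at every line boundary without any flag.
line_anchored = /^second$/.match?(text)

# \A and \z anchor to the start and end of the whole string instead.
string_anchored = /\Asecond\z/.match?(text)

# Ruby's m flag does something else: it lets . match a newline.
dot_default = /first.second/.match?(text)
dot_with_m = /first.second/m.match?(text)

puts [line_anchored, string_anchored, dot_default, dot_with_m].inspect
# prints [true, false, false, true]
```

In RE2, by contrast, `(?m)` is the flag that lets `^` and `$` match at line boundaries, which is why the constructor prepends it when `multiline: true` is passed.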



# File 'lib/gitlab/untrusted_regexp.rb', line 107

def self.with_fallback(pattern, multiline: false)
  UntrustedRegexp.new(pattern, multiline: multiline)
rescue RegexpError
  raise if Feature.enabled?(:disable_unsafe_regexp)

  if Feature.enabled?(:ci_unsafe_regexp_logger, type: :ops)
    Gitlab::AppJsonLogger.info(
      class: self.name,
      regexp: pattern.to_s,
      fabricated: 'unsafe ruby regexp'
    )
  end

  Regexp.new(pattern)
end
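
The try-then-fall-back flow above can be sketched without RE2 or GitLab's feature flags. Everything in this example is a hypothetical stand-in: `StrictPatternError`, `compile_strict`, `compile_with_fallback`, and the `allow_fallback:` keyword (which plays the role of the `disable_unsafe_regexp` flag, inverted) are not part of the class. The strict compiler only rejects lookarounds and backreferences, two constructs RE2 does not support.

```ruby
# Hypothetical error raised by the restricted compiler below.
class StrictPatternError < StandardError; end

# Lookarounds and backreferences, two constructs RE2 does not support.
UNSUPPORTED_CONSTRUCTS = /\(\?<?[=!]|\\[1-9]/

# Stand-in for the restricted engine: refuses patterns it cannot handle.
def compile_strict(pattern)
  if pattern.match?(UNSUPPORTED_CONSTRUCTS)
    raise StrictPatternError, "unsupported construct in #{pattern.inspect}"
  end

  Regexp.new(pattern)
end

# Try the restricted compiler first; fall back to Ruby's engine only
# when falling back is allowed, otherwise re-raise the original error.
def compile_with_fallback(pattern, allow_fallback: true)
  [:strict, compile_strict(pattern)]
rescue StrictPatternError
  raise unless allow_fallback

  [:fallback, Regexp.new(pattern)]
end

engine, regexp = compile_with_fallback('(?<=v)\d+')
puts engine
puts regexp.match('v42')[0]
```

Callers of such a method receive either kind of object, so they should rely only on the methods both engines share.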

Instance Method Details

#==(other) ⇒ Object



# File 'lib/gitlab/untrusted_regexp.rb', line 96

def ==(other)
  self.source == other.source
end

#extract_named_group(name, match) ⇒ Object

#scan returns an array of the captured groups rather than MatchData. Use this method to look up a value in one of those arrays by its capture group name.

Raises:

  • (RegexpError)


# File 'lib/gitlab/untrusted_regexp.rb', line 87

def extract_named_group(name, match)
  return unless match

  match_position = regexp.named_capturing_groups[name.to_s]
  raise RegexpError, "Invalid named capture group: #{name}" unless match_position

  match[match_position - 1]
end
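
The name-to-position lookup can be tried with Ruby's built-in engine. This is a sketch under assumptions: `Regexp#named_captures` stands in for RE2's `named_capturing_groups`, `ArgumentError` stands in for `RegexpError`, and the method takes the regexp as an extra argument because it is not an instance method here.

```ruby
# Ruby's Regexp stands in for RE2 here; named_captures maps each group
# name to its 1-based positions, playing the role of named_capturing_groups.
def extract_named_group(regexp, name, match)
  return unless match

  position = regexp.named_captures[name.to_s]&.first
  raise ArgumentError, "Invalid named capture group: #{name}" unless position

  # Scan results are 0-based arrays, group numbers are 1-based.
  match[position - 1]
end

pattern = /(?<key>\w+)=(?<value>\w+)/
rows = 'a=1 b=2'.scan(pattern)

values = rows.map { |row| extract_named_group(pattern, :value, row) }
puts values.inspect
# prints ["1", "2"]
```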

#match(text) ⇒ Object



# File 'lib/gitlab/untrusted_regexp.rb', line 73

def match(text)
  scan_regexp.match(text)
end

#match?(text) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/gitlab/untrusted_regexp.rb', line 77

def match?(text)
  text.present? && scan(text).present?
end

#replace(text, rewrite) ⇒ Object



# File 'lib/gitlab/untrusted_regexp.rb', line 81

def replace(text, rewrite)
  RE2.Replace(text, regexp, rewrite)
end

#replace_all(text, rewrite) ⇒ Object



# File 'lib/gitlab/untrusted_regexp.rb', line 36

def replace_all(text, rewrite)
  RE2.GlobalReplace(text, regexp, rewrite)
end

#replace_gsub(text, limit: 0) ⇒ Object

There is no built-in replace with block support (like `gsub`). We can accomplish the same thing by parsing and rebuilding the string with the substitutions.



# File 'lib/gitlab/untrusted_regexp.rb', line 42

def replace_gsub(text, limit: 0)
  new_text = +''
  remainder = text
  count = 0

  matched = match(remainder)

  until matched.nil? || matched.to_a.compact.empty?
    partitioned = remainder.partition(matched.to_s)
    new_text << partitioned.first
    remainder = partitioned.last

    new_text << yield(matched)

    if limit > 0
      count += 1
      break if count >= limit
    end

    matched = match(remainder)
  end

  new_text << remainder
end
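
The partition-and-rebuild loop can be exercised on its own with Ruby's built-in `Regexp` standing in for the RE2-backed `match`. The method name `gsub_by_partition` and the explicit `pattern` argument are assumptions made for this sketch, and the loop stops on an empty match rather than on empty captures.

```ruby
# Rebuild the string piece by piece: copy the text before each match,
# append the block's replacement, then continue with the remainder.
def gsub_by_partition(text, pattern, limit: 0)
  new_text = +''
  remainder = text
  count = 0

  matched = pattern.match(remainder)

  until matched.nil? || matched.to_s.empty?
    before, _match, after = remainder.partition(matched.to_s)
    new_text << before << yield(matched)
    remainder = after

    if limit > 0
      count += 1
      break if count >= limit
    end

    matched = pattern.match(remainder)
  end

  new_text << remainder
end

all = gsub_by_partition('a1b22c333', /\d+/) { |m| "<#{m[0].length}>" }
first_only = gsub_by_partition('a1b22c333', /\d+/, limit: 1) { |m| "<#{m[0].length}>" }

puts all        # prints a<1>b<2>c<3>
puts first_only # prints a<1>b22c333
```

Partitioning on the matched text assumes that the first occurrence of that text in the remainder is the match itself. A pattern that depends on surrounding context, such as a word boundary, can break that assumption.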

#scan(text) ⇒ Object



# File 'lib/gitlab/untrusted_regexp.rb', line 67

def scan(text)
  matches = scan_regexp.scan(text).to_a
  matches.map!(&:first) if regexp.number_of_capturing_groups == 0
  matches
end