Class: Gitlab::UntrustedRegexp
- Inherits: Object
- Defined in: lib/gitlab/untrusted_regexp.rb, lib/gitlab/untrusted_regexp/ruby_syntax.rb
Overview
An untrusted regular expression is any regexp containing patterns sourced from user input.
Ruby’s built-in regular expression library allows patterns which complete in exponential time, permitting denial-of-service attacks.
Not all regular expression features are available in untrusted regexes, and there is a strict limit on total execution time. See the RE2 documentation at github.com/google/re2/wiki/Syntax for more details.
This class doesn’t change any instance variables after construction, which allows instances to be frozen and set up in constants.
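The point about freezing can be illustrated with a minimal sketch. `FrozenMatcher` and `MemoizingMatcher` are hypothetical stand-ins that wrap Ruby's built-in `Regexp` so the example runs without the RE2 gem; neither is part of GitLab.

```ruby
# Hypothetical stand-in: all state is assigned once, in the constructor.
class FrozenMatcher
  def initialize(pattern)
    @regexp = Regexp.new(pattern)
  end

  def match?(text)
    @regexp.match?(text)
  end
end

# Hypothetical counter-example: assigns an instance variable lazily.
class MemoizingMatcher
  def initialize(pattern)
    @pattern = pattern
  end

  def match?(text)
    @regexp ||= Regexp.new(@pattern) # mutates the object on first call
    @regexp.match?(text)
  end
end

# Safe: the frozen object is only ever read.
VERSION_MATCHER = FrozenMatcher.new('\A\d+\.\d+\z').freeze
frozen_result = VERSION_MATCHER.match?('1.2')

# Unsafe: the lazy assignment fails once the object is frozen.
lazy_error =
  begin
    MemoizingMatcher.new('\A\d+\z').freeze.match?('1')
    nil
  rescue FrozenError => e
    e.class
  end
```

Because every instance variable is set in the constructor, the first form can be shared as a constant without risk of a `FrozenError` at call time.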
Defined Under Namespace
Classes: RubySyntax
Constant Summary
- BACKSLASH_R =
Recreates Ruby’s \R metacharacter. See ruby-doc.org/3.2.2/Regexp.html#class-Regexp-label-Character+Classes
'(\n|\v|\f|\r|\x{0085}|\x{2028}|\x{2029}|\r\n)'
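For reference, this is how Ruby's own `\R` behaves on some of the terminators listed in the constant. The sketch uses only Ruby's built-in engine, not RE2, and the sample string is made up for illustration.

```ruby
# A UTF-8 string containing three different line terminators.
text = "one\r\ntwo\u2028three\nfour"

# Ruby's \R treats "\r\n" as a single terminator.
lines = text.split(/\R/)
```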
Class Method Summary
-
.with_fallback(pattern, multiline: false) ⇒ Object
Handles regular expressions with the preferred RE2 library where possible via UntrustedRegexp.
Instance Method Summary
- #==(other) ⇒ Object
-
#extract_named_group(name, match) ⇒ Object
#scan returns an array of the groups captured, rather than MatchData.
-
#initialize(pattern, multiline: false) ⇒ UntrustedRegexp
constructor
A new instance of UntrustedRegexp.
- #match(text) ⇒ Object
- #match?(text) ⇒ Boolean
- #replace(text, rewrite) ⇒ Object
- #replace_all(text, rewrite) ⇒ Object
-
#replace_gsub(text) ⇒ Object
There is no built-in replace with block support (like `gsub`).
- #scan(text) ⇒ Object
Constructor Details
#initialize(pattern, multiline: false) ⇒ UntrustedRegexp
Returns a new instance of UntrustedRegexp.
    # File 'lib/gitlab/untrusted_regexp.rb', line 25
    def initialize(pattern, multiline: false)
      if multiline
        pattern = "(?m)#{pattern}"
      end

      @regexp = RE2::Regexp.new(pattern, log_errors: false)
      @scan_regexp = initialize_scan_regexp

      raise RegexpError, regexp.error unless regexp.ok?
    end
Class Method Details
.with_fallback(pattern, multiline: false) ⇒ Object
Handles regular expressions with the preferred RE2 library where possible via UntrustedRegexp. Falls back to Ruby’s built-in regular expression library when the syntax would be invalid in RE2.
One difference between the two engines is `(?m)` multi-line mode. Ruby regexes enable this behavior by default, but also handle `^` and `$` differently. See: www.regular-expressions.info/modifiers.html
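The Ruby side of that difference can be checked with the standard library alone. This is a small sketch with a made-up sample string; it exercises only Ruby's built-in engine and says nothing about RE2's behavior beyond what the text above states.

```ruby
text = "first\nsecond"

# In Ruby, ^ and $ always match at line boundaries, with no flag required.
line_anchors = text.match?(/^second$/)

# \A and \z are Ruby's whole-string anchors.
string_anchors = text.match?(/\Asecond\z/)

# Ruby's own /m flag changes what "." matches rather than what ^ and $ do.
dot_default = text.match?(/first.second/)
dot_multiline = text.match?(/first.second/m)
```

Patterns that are meant to anchor on the whole input are therefore usually written with `\A` and `\z` when they may end up in Ruby's engine.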
    # File 'lib/gitlab/untrusted_regexp.rb', line 101
    def self.with_fallback(pattern, multiline: false)
      UntrustedRegexp.new(pattern, multiline: multiline)
    rescue RegexpError
      raise if Feature.enabled?(:disable_unsafe_regexp)

      if Feature.enabled?(:ci_unsafe_regexp_logger, type: :ops)
        Gitlab::AppJsonLogger.info(
          class: self.name,
          regexp: pattern.to_s,
          fabricated: 'unsafe ruby regexp'
        )
      end

      Regexp.new(pattern)
    end
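The try-strict-then-fall-back flow can be sketched without RE2 or feature flags. `StrictPattern`, `compile_with_fallback`, and the `allow_fallback` keyword are hypothetical names for illustration only. The stand-in rejects lookarounds and backreferences (two features RE2 does not support) with a deliberately crude check, where the real class relies on RE2's own parser.

```ruby
# Hypothetical stand-in for a strict engine. It refuses syntax the strict
# engine cannot handle, then delegates matching to Ruby's Regexp.
class StrictPattern
  # Crude detection of lookaround groups and numeric backreferences.
  UNSUPPORTED = /\(\?<?[=!]|\\[1-9]/

  def initialize(source)
    raise RegexpError, "unsupported syntax: #{source}" if source.match?(UNSUPPORTED)

    @regexp = Regexp.new(source)
  end

  def match?(text)
    @regexp.match?(text)
  end
end

# Prefer the strict engine; fall back to Ruby's Regexp only when allowed.
def compile_with_fallback(source, allow_fallback: true)
  StrictPattern.new(source)
rescue RegexpError
  raise unless allow_fallback # re-raises the original RegexpError

  Regexp.new(source)
end

strict = compile_with_fallback('ab+')
fallback = compile_with_fallback('a(?=b)')
```

Both return values respond to `match?`, so callers in this sketch can treat them alike, at the cost of losing the strict engine's guarantees on the fallback path.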
Instance Method Details
#==(other) ⇒ Object
    # File 'lib/gitlab/untrusted_regexp.rb', line 90
    def ==(other)
      self.source == other.source
    end
#extract_named_group(name, match) ⇒ Object
#scan returns an array of the groups captured, rather than MatchData. Use this method to pass the capture group name and retrieve the corresponding value from a single #scan result.
    # File 'lib/gitlab/untrusted_regexp.rb', line 81
    def extract_named_group(name, match)
      return unless match

      match_position = regexp.named_capturing_groups[name.to_s]
      raise RegexpError, "Invalid named capture group: #{name}" unless match_position

      match[match_position - 1]
    end
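The name-to-position lookup can be tried out with Ruby's built-in engine, whose `String#scan` also returns plain arrays of groups. This is a sketch under assumptions: `PAIR` and `named_group_from_row` are hypothetical, and `Regexp#named_captures` stands in for RE2's `named_capturing_groups`.

```ruby
# Hypothetical sample pattern with two named groups.
PAIR = /(?<key>\w+)=(?<value>\d+)/

# Looks up a named group in one row returned by String#scan.
def named_group_from_row(regexp, name, row)
  return unless row

  # Regexp#named_captures maps each group name to a list of 1-based positions.
  positions = regexp.named_captures[name.to_s]
  raise RegexpError, "Invalid named capture group: #{name}" unless positions

  # Rows are 0-based arrays, hence the offset.
  row[positions.first - 1]
end

rows = 'a=1 b=22'.scan(PAIR)
values = rows.map { |row| named_group_from_row(PAIR, :value, row) }
```

Here `rows` is `[["a", "1"], ["b", "22"]]` and `values` is `["1", "22"]`.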
#match(text) ⇒ Object
    # File 'lib/gitlab/untrusted_regexp.rb', line 67
    def match(text)
      scan_regexp.match(text)
    end
#match?(text) ⇒ Boolean
    # File 'lib/gitlab/untrusted_regexp.rb', line 71
    def match?(text)
      text.present? && scan(text).present?
    end
#replace(text, rewrite) ⇒ Object
    # File 'lib/gitlab/untrusted_regexp.rb', line 75
    def replace(text, rewrite)
      RE2.Replace(text, regexp, rewrite)
    end
#replace_all(text, rewrite) ⇒ Object
    # File 'lib/gitlab/untrusted_regexp.rb', line 36
    def replace_all(text, rewrite)
      RE2.GlobalReplace(text, regexp, rewrite)
    end
#replace_gsub(text) ⇒ Object
There is no built-in replace with block support (like `gsub`). We can accomplish the same thing by parsing and rebuilding the string with the substitutions.
    # File 'lib/gitlab/untrusted_regexp.rb', line 42
    def replace_gsub(text)
      new_text = +''
      remainder = text

      matched = match(remainder)

      until matched.nil? || matched.to_a.compact.empty?
        partitioned = remainder.partition(matched.to_s)
        new_text << partitioned.first
        remainder = partitioned.last

        new_text << yield(matched)

        matched = match(remainder)
      end

      new_text << remainder
    end
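The partition-and-rebuild loop can be exercised on its own. In this sketch, `replace_with_block` is a hypothetical free-standing version that takes the pattern as an argument and uses Ruby's `Regexp#match` in place of the RE2-backed `match`.

```ruby
# Hypothetical stand-alone version of the loop, using Ruby's Regexp.
def replace_with_block(regexp, text)
  new_text = +''
  remainder = text
  matched = regexp.match(remainder)

  until matched.nil? || matched.to_a.compact.empty?
    # Split around the first occurrence of the matched substring.
    before, _hit, after = remainder.partition(matched.to_s)

    new_text << before
    new_text << yield(matched) # the block decides the replacement
    remainder = after

    matched = regexp.match(remainder)
  end

  # Whatever is left contains no further matches.
  new_text << remainder
end

doubled = replace_with_block(/\d+/, 'a1b22c') { |m| (m[0].to_i * 2).to_s }
```

Here `doubled` is `"a2b44c"`. In this sketch, a pattern that can match the empty string would never advance, so the example sticks to a pattern that always consumes at least one character.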
#scan(text) ⇒ Object
    # File 'lib/gitlab/untrusted_regexp.rb', line 61
    def scan(text)
      matches = scan_regexp.scan(text).to_a
      matches.map!(&:first) if regexp.number_of_capturing_groups == 0
      matches
    end