Class: Gitlab::UrlSanitizer
- Inherits: Object
  - Object
    - Gitlab::UrlSanitizer
- Includes: Gitlab::Utils::StrongMemoize
- Defined in: lib/gitlab/url_sanitizer.rb
Constant Summary
- ALLOWED_SCHEMES =
%w[http https ssh git].freeze
- ALLOWED_WEB_SCHEMES =
%w[http https].freeze
- SCHEMIFIED_SCHEME =
'glschemelessuri'
- SCHEMIFY_PLACEHOLDER =
"#{SCHEMIFIED_SCHEME}://".freeze
- URI_REGEXP =
URI::DEFAULT_PARSER.make_regexp only matches URLs with schemes or relative URLs. The second alternative of this expression matches schemeless URIs with userinfo (e.g. user:pass@gitlab.com) but does not match scp-style URIs (e.g. user@server:path/to/file).
The userinfo part is very loose compared to URI's implementation, so it also matches non-escaped userinfo, e.g. foo:b?r@gitlab.com, which should be encoded as foo:b%3Fr@gitlab.com.
%r{
  (?:
    #{URI::DEFAULT_PARSER.make_regexp(ALLOWED_SCHEMES)}
  |
    (?:(?:(?!@)[%#{URI::REGEXP::PATTERN::UNRESERVED}#{URI::REGEXP::PATTERN::RESERVED}])+(?:@))
    (?# negative lookahead ensures this isn't an SCP-style URL: [host]:[rel_path|abs_path] server:path/to/file)
    (?!#{URI::REGEXP::PATTERN::HOST}:(?:#{URI::REGEXP::PATTERN::REL_PATH}|#{URI::REGEXP::PATTERN::ABS_PATH}))
    #{URI::REGEXP::PATTERN::HOSTPORT}
  )
}x
- MASKED_USERINFO_REGEX =
This expression is derived from `URI::REGEXP::PATTERN::USERINFO`, with `{` and `}` added to the list of allowed characters to account for the possibility of the userinfo portion of a URL containing masked segments, e.g. myuser:{masked_password}@{masked_domain}.com/{masked_hook}
%r{(?:[\\-_.!~*'()a-zA-Z\d;:&=+$,{}]|%[a-fA-F\d]{2})*}
Class Method Summary
- .sanitize(content) ⇒ Object
- .sanitize_masked_url(url) ⇒ Object
  The URL associated with records like `WebHookLog` may contain masked portions represented by paired curly brackets in the URL.
- .valid?(url, allowed_schemes: ALLOWED_SCHEMES) ⇒ Boolean
- .valid_web?(url) ⇒ Boolean
Instance Method Summary
- #credentials ⇒ Object
- #full_url ⇒ Object
- #initialize(url, credentials: nil) ⇒ UrlSanitizer (constructor)
  A new instance of UrlSanitizer.
- #masked_url ⇒ Object
- #sanitized_url ⇒ Object
- #user ⇒ Object
Constructor Details
#initialize(url, credentials: nil) ⇒ UrlSanitizer
Returns a new instance of UrlSanitizer.
# File 'lib/gitlab/url_sanitizer.rb', line 67
def initialize(url, credentials: nil)
  %i[user password].each do |symbol|
    credentials[symbol] = credentials[symbol].presence if credentials&.key?(symbol)
  end

  @credentials = credentials
  @url = parse_url(url)
end
Class Method Details
.sanitize(content) ⇒ Object
# File 'lib/gitlab/url_sanitizer.rb', line 36
def self.sanitize(content)
  content.gsub(URI_REGEXP) do |url|
    new(url).masked_url
  rescue Addressable::URI::InvalidURIError
    ''
  end
end
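The scan-and-replace pattern used by `.sanitize` can be tried outside GitLab with a rough standalone sketch. It makes several assumptions: it uses Ruby's stdlib `URI` instead of Addressable, it reproduces only the scheme-based half of `URI_REGEXP`, and `URL_PATTERN` and `mask_urls_in` are hypothetical names that do not exist in the class.

```ruby
require 'uri'

# Assumption: only the scheme-based alternative of URI_REGEXP is reproduced,
# so schemeless URIs with userinfo are not matched by this sketch.
URL_PATTERN = URI::DEFAULT_PARSER.make_regexp(%w[http https ssh git])

# Hypothetical helper: masks credentials in every URL found in free text.
def mask_urls_in(text)
  text.gsub(URL_PATTERN) do |url|
    uri = URI.parse(url)
    password_present = !uri.password.nil?
    # stdlib URI requires a user before a password can be set, so the user
    # is masked first (the listing above sets the password first).
    uri.user = '*****' if uri.user
    uri.password = '*****' if password_present
    uri.to_s
  rescue URI::Error
    # Mirror the listing: drop anything that cannot be parsed.
    ''
  end
end

puts mask_urls_in('remote: https://bob:s3cret@example.com/repo.git not found')
```

Text without a matching URL passes through unchanged, because `gsub` only invokes the block for matches.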
.sanitize_masked_url(url) ⇒ Object
The URL associated with records like `WebHookLog` may contain masked portions represented by paired curly brackets in the URL. As this prohibits straightforward parsing of the URL, we can use a variation of the existing USERINFO regex for these cases.
# File 'lib/gitlab/url_sanitizer.rb', line 63
def self.sanitize_masked_url(url)
  url.gsub(%r{//#{MASKED_USERINFO_REGEX}@}o, '//*****:*****@')
end
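Because this method relies only on a regular expression, its behaviour can be reproduced without loading GitLab. The following is a minimal sketch: the pattern is copied from `MASKED_USERINFO_REGEX` in the constant summary, while `MASKED_USERINFO` and `hide_masked_userinfo` are hypothetical names chosen for illustration.

```ruby
# Pattern copied from MASKED_USERINFO_REGEX in the constant summary.
MASKED_USERINFO = %r{(?:[\\-_.!~*'()a-zA-Z\d;:&=+$,{}]|%[a-fA-F\d]{2})*}

# Hypothetical helper: replaces whatever sits between "//" and "@" with a
# fixed placeholder, without attempting to parse the URL.
def hide_masked_userinfo(url)
  url.gsub(%r{//#{MASKED_USERINFO}@}, '//*****:*****@')
end

puts hide_masked_userinfo('https://myuser:{masked_password}@{masked_domain}.com/{masked_hook}')
puts hide_masked_userinfo('https://example.com/{masked_hook}')
```

The masked segments in the host and path are left untouched; only the userinfo portion is rewritten, and a URL with no `@` after `//` is returned as-is.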
.valid?(url, allowed_schemes: ALLOWED_SCHEMES) ⇒ Boolean
# File 'lib/gitlab/url_sanitizer.rb', line 44
def self.valid?(url, allowed_schemes: ALLOWED_SCHEMES)
  return false unless url.present?
  return false unless url.is_a?(String)

  uri = Addressable::URI.parse(url.strip)

  allowed_schemes.include?(uri.scheme)
rescue Addressable::URI::InvalidURIError
  false
end
.valid_web?(url) ⇒ Boolean
# File 'lib/gitlab/url_sanitizer.rb', line 55
def self.valid_web?(url)
  valid?(url, allowed_schemes: ALLOWED_WEB_SCHEMES)
end
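Both `.valid?` and `.valid_web?` reduce to a scheme allowlist check. A minimal standalone sketch of that idea follows, assuming stdlib `URI` in place of Addressable (the two parsers do not accept exactly the same inputs) and plain Ruby checks in place of ActiveSupport's `present?`. `scheme_allowed?` is a hypothetical name; the two constants are copied from the constant summary.

```ruby
require 'uri'

ALLOWED_SCHEMES = %w[http https ssh git].freeze
ALLOWED_WEB_SCHEMES = %w[http https].freeze

# Hypothetical helper: true only when the URL parses and its scheme is
# in the allowlist.
def scheme_allowed?(url, allowed_schemes: ALLOWED_SCHEMES)
  return false unless url.is_a?(String)
  return false if url.strip.empty?

  allowed_schemes.include?(URI.parse(url.strip).scheme)
rescue URI::InvalidURIError
  false
end

puts scheme_allowed?('ssh://git@example.com/group/project.git')
puts scheme_allowed?('ssh://git@example.com/group/project.git', allowed_schemes: ALLOWED_WEB_SCHEMES)
```

Passing a narrower list through `allowed_schemes:` is how `.valid_web?` restricts the check to `http` and `https`.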
Instance Method Details
#credentials ⇒ Object
# File 'lib/gitlab/url_sanitizer.rb', line 76
def credentials
  @credentials ||= { user: @url.user.presence, password: @url.password.presence }
end
#full_url ⇒ Object
# File 'lib/gitlab/url_sanitizer.rb', line 100
def full_url
  return reverse_schemify(@url.to_s) unless valid_credentials?

  url = @url.dup
  url.password = encode_percent(credentials[:password]) if credentials[:password].present?
  url.user = encode_percent(credentials[:user]) if credentials[:user].present?
  reverse_schemify(url.to_s)
end
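The general idea behind `#full_url`, merging separately supplied credentials back into a URL, can be sketched with the standard library. This is an illustration under stated assumptions, not the class's implementation: the private helpers `encode_percent`, `valid_credentials?` and `reverse_schemify` are not shown in this listing, so the sketch substitutes `ERB::Util.url_encode` for the encoding step, which may encode more characters than `encode_percent` does. `url_with_credentials` is a hypothetical name.

```ruby
require 'erb'
require 'uri'

# Hypothetical helper: returns the URL with percent-encoded credentials
# inserted. ERB::Util.url_encode stands in for the unseen encode_percent.
def url_with_credentials(url, user: nil, password: nil)
  uri = URI.parse(url)
  return uri.to_s if user.nil? || user.empty?

  # stdlib URI requires the user to be set before the password.
  uri.user = ERB::Util.url_encode(user)
  uri.password = ERB::Util.url_encode(password) unless password.nil? || password.empty?
  uri.to_s
end

puts url_with_credentials('https://example.com/repo.git', user: 'bob', password: 'p@ss:word')
```

Encoding matters here because characters such as `@` and `:` inside a password would otherwise be read as URL delimiters.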
#masked_url ⇒ Object
# File 'lib/gitlab/url_sanitizer.rb', line 92
def masked_url
  url = @url.dup
  url.password = "*****" if url.password.present?
  url.user = "*****" if url.user.present?
  reverse_schemify(url.to_s)
end
#sanitized_url ⇒ Object
# File 'lib/gitlab/url_sanitizer.rb', line 84
def sanitized_url
  safe_url = @url.dup
  safe_url.password = nil
  safe_url.user = nil
  reverse_schemify(safe_url.to_s)
end
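The difference between `#masked_url` (credentials replaced by a placeholder) and `#sanitized_url` (credentials removed) is easiest to see side by side. The sketch below assumes stdlib `URI` rather than Addressable and skips the schemeless-URI handling done by `reverse_schemify`; `mask_credentials` and `strip_credentials` are hypothetical names.

```ruby
require 'uri'

# Hypothetical helper: keeps the shape of the userinfo but hides its values.
def mask_credentials(url)
  uri = URI.parse(url)
  password_present = !uri.password.nil?
  # stdlib URI requires a user before a password can be set, so the user
  # is masked first.
  uri.user = '*****' if uri.user
  uri.password = '*****' if password_present
  uri.to_s
end

# Hypothetical helper: removes the userinfo entirely.
def strip_credentials(url)
  uri = URI.parse(url)
  uri.password = nil
  uri.user = nil
  uri.to_s
end

url = 'https://bob:s3cret@example.com/repo.git'
puts mask_credentials(url)
puts strip_credentials(url)
```

Masking is useful when a reader should still see that credentials were present, for example in log output; stripping suits cases where the URL is stored or displayed without any trace of them.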
#user ⇒ Object
# File 'lib/gitlab/url_sanitizer.rb', line 80
def user
  credentials[:user]
end