Module: Gitlab::PathRegex

Extended by:
PathRegex
Included in:
PathRegex
Defined in:
lib/gitlab/path_regex.rb

Constant Summary

TOP_LEVEL_ROUTES =

All routes that appear at the top level must be listed here. This ensures that groups cannot be created with these names, since such groups would be masked by the routes already in place.

Example:

 /api/api-project

The path `api` shouldn't be allowed because it would be masked by `api/*`.
%w[
  -
  .well-known
  404.html
  422.html
  500.html
  502.html
  503.html
  admin
  api
  apple-touch-icon.png
  assets
  dashboard
  deploy.html
  explore
  favicon.ico
  favicon.png
  groups
  health_check
  help
  import
  jwt
  login
  o
  oauth
  profile
  projects
  public
  robots.txt
  s
  search
  sitemap
  sitemap.xml
  sitemap.xml.gz
  slash-command-logo.png
  snippets
  unsubscribes
  uploads
  users
  v2
].freeze
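
The effect of this list can be sketched with a small standalone example. It uses only a subset of the words above and a simplified, hypothetical segment pattern (the names `reserved_top_level`, `root_segment` and `allowed_root` are not part of the module); the lookahead mirrors the one the route helpers in this module build from `TOP_LEVEL_ROUTES`.

```ruby
# Illustrative subset of TOP_LEVEL_ROUTES; the real list is longer.
reserved_top_level = %w[- admin api groups v2].freeze

# Case-insensitive alternation of the reserved words.
illegal_words = Regexp.new(Regexp.union(reserved_top_level).source, Regexp::IGNORECASE)

# Simplified, hypothetical first-segment pattern: the negative lookahead
# rejects a reserved word only when it forms the whole segment before "/".
root_segment = %r{\A(?!(#{illegal_words})/)[a-zA-Z0-9_.][a-zA-Z0-9_\-.]*/}

allowed_root = ->(path) { root_segment.match?(path) }

puts allowed_root.call('api/')         # reserved: would be masked by api/*
puts allowed_root.call('api-project/') # allowed: only a prefix is reserved
puts allowed_root.call('API/')         # reserved regardless of case
```

Because the lookahead requires a trailing `/`, a name that merely starts with a reserved word is still accepted.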
PROJECT_WILDCARD_ROUTES =

NOTE: Do not add new items to this list unless necessary as this will cause conflicts with existing namespaced routes for groups or projects. See docs.gitlab.com/ee/development/routing.html#project-routes

This list should contain all words following `/*namespace_id/:project_id` in routes that contain a second wildcard.

Example:

/*namespace_id/:project_id/badges/*ref/build

If `badges` were allowed as a project/group name, we would not be able to access the `badges` route for those projects:

Consider a namespace with path `foo/bar` and a project called `badges`. The route to the build badge would then be `/foo/bar/badges/badges/master/build.svg`.

When accessing this path, the route would be matched to the `badges` path with the following params:

- namespace_id: `foo`
- project_id: `bar`
- ref: `badges/master`

Failing to find the project, this would result in a 404.

By rejecting `badges`, the router can count on the fact that `badges` will be preceded by the `namespace/project`.

%w[
  -
  badges
  blame
  blob
  builds
  commits
  create
  create_dir
  edit
  environments/folders
  files
  find_file
  gitlab-lfs/objects
  info/lfs/objects
  new
  preview
  raw
  refs
  tree
  update
  wikis
].freeze
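
The ambiguity described above can be reproduced with a plain regular expression. This is only an approximation of the route `/*namespace_id/:project_id/badges/*ref/build`, not the Rails router itself; `badge_route` is a hypothetical name, and the lazy `namespace_id` group is an assumption chosen to reproduce the parameters listed in the description.

```ruby
# Hypothetical stand-in for the route
# /*namespace_id/:project_id/badges/*ref/build (not the real router).
badge_route = %r{\A/(?<namespace_id>.+?)/(?<project_id>[^/]+)/badges/(?<ref>.+)/build\.svg\z}

# Namespace "foo/bar" containing a project that is itself called "badges".
match = badge_route.match('/foo/bar/badges/badges/master/build.svg')
params = match.named_captures

# The first "badges" segment is taken as the route keyword, so the project
# is resolved as "bar" inside "foo" and the ref absorbs the second "badges".
params.each { |name, value| puts "#{name}: #{value}" }
```

The intended reading (namespace `foo/bar`, project `badges`, ref `master`) is never produced, which is why the word is reserved.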
GROUP_ROUTES =

NOTE: Do not add new items to this list unless necessary as this will cause conflicts with existing namespaced routes for groups or projects. See docs.gitlab.com/ee/development/routing.html#group-routes

These are all the paths that follow `/groups/*id/` or `/groups/*group_id`. We need to reject these because we have a `/groups/*id` page that is the same as the `/*id`.

If we allowed a subgroup to be created with the name `activity`, this group would not be accessible through `/groups/parent/activity`, since this would map to the activity page of its parent.

%w[
  -
].freeze
ILLEGAL_PROJECT_PATH_WORDS =
PROJECT_WILDCARD_ROUTES
ILLEGAL_GROUP_PATH_WORDS =
(PROJECT_WILDCARD_ROUTES | GROUP_ROUTES).freeze
ILLEGAL_ORGANIZATION_PATH_WORDS =
(TOP_LEVEL_ROUTES | PROJECT_WILDCARD_ROUTES | GROUP_ROUTES).freeze
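
The three `ILLEGAL_*` lists are built with `Array#|`, which is a set union: duplicates such as `-` are kept once and the order of first appearance is preserved. A minimal sketch with shortened, illustrative lists (the lowercase variable names are stand-ins for the constants above):

```ruby
# Shortened stand-ins for the real constants.
top_level_routes        = %w[- admin api].freeze
project_wildcard_routes = %w[- badges blob tree].freeze
group_routes            = %w[-].freeze

# Same composition as ILLEGAL_GROUP_PATH_WORDS and
# ILLEGAL_ORGANIZATION_PATH_WORDS.
illegal_group_words = (project_wildcard_routes | group_routes).freeze
illegal_org_words   = (top_level_routes | project_wildcard_routes | group_routes).freeze

p illegal_group_words
p illegal_org_words
```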
PATH_START_CHAR =

The namespace regex is used in JavaScript to validate usernames in the “Register” form. However, JavaScript does not support the negative lookbehind assertion `(?<!)` that disallows usernames ending in `.git` and `.atom`. Since this is a non-trivial problem to solve in JavaScript (it would mean heavily complicating the regex, modifying view code to allow non-regex validations, etc.), `NAMESPACE_FORMAT_REGEX_JS` serves as a JavaScript-compatible version of `NAMESPACE_FORMAT_REGEX`, with the negative lookbehind assertion removed. This means that client-side validation will pass for usernames ending in `.atom` and `.git`, but they will be caught by the server-side validation.

'[a-zA-Z0-9_\.]'
PATH_REGEX_STR =
PATH_START_CHAR + '[a-zA-Z0-9_\-\.]' + "{0,#{Namespace::URL_MAX_LENGTH - 1}}"
NAMESPACE_FORMAT_REGEX_JS =
PATH_REGEX_STR + '[a-zA-Z0-9_\-]|[a-zA-Z0-9_]'
NO_SUFFIX_REGEX =
/(?<!\.git|\.atom)/
NAMESPACE_FORMAT_REGEX =
/(?:#{NAMESPACE_FORMAT_REGEX_JS})#{NO_SUFFIX_REGEX}/
PROJECT_PATH_FORMAT_REGEX =
/(?:#{PATH_REGEX_STR})#{NO_SUFFIX_REGEX}/
FULL_NAMESPACE_FORMAT_REGEX =
%r{(#{NAMESPACE_FORMAT_REGEX}/){,#{Namespace::NUMBER_OF_ANCESTORS_ALLOWED}}#{NAMESPACE_FORMAT_REGEX}}
ORGANIZATION_PATH_REGEX_STR =
'[a-zA-Z0-9_][a-zA-Z0-9_\-]' + "{0,#{Namespace::URL_MAX_LENGTH - 1}}"
ORGANIZATION_PATH_FORMAT_REGEX =
/(?:#{ORGANIZATION_PATH_REGEX_STR})/
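
The difference between the JavaScript-compatible string and the full Ruby regex can be seen by rebuilding the constants outside the module. The sketch below copies the definitions above into local variables; `url_max_length = 255` is an assumed stand-in for `Namespace::URL_MAX_LENGTH`, and the anchored `server_side` and `client_side` patterns are hypothetical names used only for this demonstration.

```ruby
url_max_length = 255 # assumption: stand-in for Namespace::URL_MAX_LENGTH

path_start_char = '[a-zA-Z0-9_\.]'
path_regex_str  = path_start_char + '[a-zA-Z0-9_\-\.]' + "{0,#{url_max_length - 1}}"
namespace_format_regex_js = path_regex_str + '[a-zA-Z0-9_\-]|[a-zA-Z0-9_]'
no_suffix_regex = /(?<!\.git|\.atom)/
namespace_format_regex = /(?:#{namespace_format_regex_js})#{no_suffix_regex}/

# Anchored variants: with and without the negative lookbehind.
server_side = /\A#{namespace_format_regex}\z/
client_side = /\A(?:#{namespace_format_regex_js})\z/

%w[gitlab-org foo.git foo.atom -foo foo.].each do |name|
  puts format('%-12s client: %-5s server: %s',
              name, client_side.match?(name), server_side.match?(name))
end
```

Only the names ending in `.git` or `.atom` are treated differently by the two patterns; a leading `-` or a trailing `.` is rejected by both.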

Instance Method Summary

Instance Method Details

#archive_formats_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 251

def archive_formats_regex
  #                           |zip|tar|    tar.gz    |         tar.bz2         |
  @archive_formats_regex ||= /(zip|tar|tar\.gz|tgz|gz|tar\.bz2|tbz|tbz2|tb2|bz2)/
end

#container_image_blob_sha_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 287

def container_image_blob_sha_regex
  @container_image_blob_sha_regex ||= %r{[\w+.-]+:?\w+}
end

#container_image_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 283

def container_image_regex
  @container_image_regex ||= %r{([\w\.-]+\/){0,4}[\w\.-]+}
end
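
The pattern itself is unanchored, so it is meant to be embedded in a larger expression. As a rough illustration of what it accepts, the sketch below copies the pattern into a local variable and wraps it in `\A...\z`; the anchors and the name `anchored_image` are additions for this example only.

```ruby
# Same pattern as container_image_regex: up to four "segment/" prefixes
# followed by a final segment of word characters, dots or hyphens.
container_image = %r{([\w\.-]+\/){0,4}[\w\.-]+}

# Anchors added for the demonstration only.
anchored_image = /\A#{container_image}\z/

%w[alpine library/alpine a/b/c/d/e a/b/c/d/e/f].each do |image|
  puts "#{image}: #{anchored_image.match?(image)}"
end
```

With the anchors in place, an image path with more than five segments no longer matches.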

#dependency_proxy_route_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 291

def dependency_proxy_route_regex
  @dependency_proxy_route_regex ||= %r{\A/v2/#{full_namespace_route_regex}/dependency_proxy/containers/#{container_image_regex}/(manifests|blobs)/#{container_image_blob_sha_regex}\z}
end

#full_namespace_path_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 209

def full_namespace_path_regex
  @full_namespace_path_regex ||= %r{\A#{full_namespace_route_regex}/\z}
end

#full_namespace_route_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 167

def full_namespace_route_regex
  @full_namespace_route_regex ||= begin
    illegal_words = Regexp.new(Regexp.union(ILLEGAL_GROUP_PATH_WORDS).source, Regexp::IGNORECASE)

    single_line_regexp %r{
      #{root_namespace_route_regex}
      (?:
        /
        (?!#{illegal_words}/)
        #{NAMESPACE_FORMAT_REGEX}
      )*
    }x
  end
end
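
The structure of this regex (a root segment checked against the top-level words, followed by any number of nested segments checked against the group words) can be sketched in isolation. The word lists below are shortened, `segment` is a simplified stand-in for `NAMESPACE_FORMAT_REGEX`, and the private `single_line_regexp` helper is left out, so this is an approximation rather than the module's exact pattern.

```ruby
# Shortened, illustrative word lists.
top_level_words = %w[- admin api]
group_words     = %w[- badges blob tree]

top_illegal    = Regexp.new(Regexp.union(top_level_words).source, Regexp::IGNORECASE)
nested_illegal = Regexp.new(Regexp.union(group_words).source, Regexp::IGNORECASE)

# Simplified stand-in for NAMESPACE_FORMAT_REGEX (no suffix check).
segment = /[a-zA-Z0-9_.][a-zA-Z0-9_\-.]*/

full_namespace_route = %r{
  (?!(#{top_illegal})/)
  #{segment}
  (?:
    /
    (?!#{nested_illegal}/)
    #{segment}
  )*
}x

# Anchored the same way as full_namespace_path_regex.
full_namespace_path = %r{\A#{full_namespace_route}/\z}

%w[gitlab-org/gitlab/ gitlab-org/tree/ gitlab-org/trees/ api/foo/].each do |path|
  puts "#{path}: #{full_namespace_path.match?(path)}"
end
```

Each nested segment gets its own lookahead, so a reserved word is rejected at any depth, while a longer name that only begins with one (such as `trees`) is accepted.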

#full_project_git_path_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 221

def full_project_git_path_regex
  @full_project_git_path_regex ||= %r{\A\/?(?<namespace_path>#{full_namespace_route_regex})\/(?<project_path>#{project_route_regex})\.git\z}
end
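
The two named captures make it possible to split a repository URL path into its namespace and project parts. The sketch below keeps the capture names and the overall shape of the regex, but substitutes simplified, hypothetical patterns for `full_namespace_route_regex` and `project_route_regex` (no reserved-word checks).

```ruby
# Simplified stand-ins for the two interpolated route regexes.
segment         = /[a-zA-Z0-9_.][a-zA-Z0-9_\-.]*/
namespace_route = %r{#{segment}(?:/#{segment})*}
project_route   = segment

git_path = %r{\A/?(?<namespace_path>#{namespace_route})/(?<project_path>#{project_route})\.git\z}

match = git_path.match('/gitlab-org/subgroup/gitlab.git')
puts match[:namespace_path] # everything before the last segment
puts match[:project_path]   # last segment without the ".git" suffix

# Paths without the ".git" suffix do not match at all.
p git_path.match('/gitlab-org/subgroup/gitlab')
```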

#full_project_path_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 217

def full_project_path_regex
  @full_project_path_regex ||= %r{\A#{full_namespace_route_regex}/#{project_route_regex}/\z}
end

#full_snippets_repository_path_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 279

def full_snippets_repository_path_regex
  %r{\A(#{personal_snippet_repository_path_regex}|#{project_snippet_repository_path_regex})\z}
end

#git_reference_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 256

def git_reference_regex
  # Valid git ref regex, see:
  # https://www.kernel.org/pub/software/scm/git/docs/git-check-ref-format.html

  @git_reference_regex ||= single_line_regexp %r{
    (?!
       (?# doesn't begin with)
       \/|                    (?# rule #6)
       (?# doesn't contain)
       .*(?:
          [\/.]\.|            (?# rule #1,3)
          \/\/|               (?# rule #6)
          @\{|                (?# rule #8)
          \\                  (?# rule #9)
       )
    )
    [^\000-\040\177~^:?*\[]+  (?# rule #4-5)
    (?# doesn't end with)
    (?<!\.lock)               (?# rule #1)
    (?<![\/.])                (?# rule #6-7)
  }x
end
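
To see which ref names the pattern accepts, it can be anchored and tried against a few candidates. The sketch below restates the pattern in extended form (slashes unescaped, comments reworded) without the private `single_line_regexp` helper; `anchored_ref` and `valid_ref` are hypothetical names introduced for the example.

```ruby
git_reference = %r{
  (?!
     /|                      (?# must not begin with a slash)
     .*(?:
        [/.]\.|              (?# no slash-dot or dot-dot sequence)
        //|                  (?# no consecutive slashes)
        @\{|                 (?# no at-sign followed by an open brace)
        \\                   (?# no backslash)
     )
  )
  [^\000-\040\177~^:?*\[]+   (?# no control characters, spaces or glob characters)
  (?<!\.lock)                (?# must not end with dot-lock)
  (?<![/.])                  (?# must not end with a slash or a dot)
}x

# Anchors added so the whole name has to be a valid ref.
anchored_ref = /\A#{git_reference}\z/
valid_ref = ->(name) { anchored_ref.match?(name) }

['main', 'feature/login', 'v1.0', 'a..b', '/main', 'main/',
 'refs/heads/main.lock', 'has space', 'release@{1}'].each do |name|
  puts "#{name.inspect}: #{valid_ref.call(name)}"
end
```

The leading negative lookahead handles the "must not contain" rules in one place, while the two lookbehinds at the end handle the "must not end with" rules.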

#namespace_format_message ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 237

def namespace_format_message
  "can contain only letters, digits, '_', '-' and '.'. " \
  "Cannot start with '-' or end in '.', '.git' or '.atom'."
end

#namespace_format_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 233

def namespace_format_regex
  @namespace_format_regex ||= /\A#{NAMESPACE_FORMAT_REGEX}\z/o
end

#organization_format_message ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 229

def organization_format_message
  "can contain only letters, digits, '_' and '-'. Cannot start with '-'."
end

#organization_format_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 225

def organization_format_regex
  @organization_format_regex ||= /\A#{ORGANIZATION_PATH_FORMAT_REGEX}\z/o
end

#organization_path_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 213

def organization_path_regex
  @organization_path_regex ||= %r{\A#{organization_route_regex}/\z}
end

#organization_route_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 145

def organization_route_regex
  @organization_route_regex ||= begin
    illegal_words = Regexp.new(Regexp.union(ILLEGAL_ORGANIZATION_PATH_WORDS).source, Regexp::IGNORECASE)

    single_line_regexp %r{
      (?!(#{illegal_words})/)
      #{ORGANIZATION_PATH_FORMAT_REGEX}
    }x
  end
end

#project_path_format_message ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 246

def project_path_format_message
  "can contain only letters, digits, '_', '-' and '.'. " \
  "Cannot start with '-', end in '.git' or end in '.atom'"
end

#project_path_format_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 242

def project_path_format_regex
  @project_path_format_regex ||= /\A#{PROJECT_PATH_FORMAT_REGEX}\z/o
end

#project_route_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 182

def project_route_regex
  @project_route_regex ||= begin
    illegal_words = Regexp.new(Regexp.union(ILLEGAL_PROJECT_PATH_WORDS).source, Regexp::IGNORECASE)

    single_line_regexp %r{
      (?!(#{illegal_words})/)
      #{PROJECT_PATH_FORMAT_REGEX}
    }x
  end
end

#repository_git_lfs_route_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 201

def repository_git_lfs_route_regex
  @repository_git_lfs_route_regex ||= %r{#{repository_git_route_regex}\/(info\/lfs|gitlab-lfs)\/}
end

#repository_git_route_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 197

def repository_git_route_regex
  @repository_git_route_regex ||= /#{repository_route_regex}\.git/
end

#repository_route_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 193

def repository_route_regex
  @repository_route_regex ||= /(#{full_namespace_route_regex}|#{personal_snippet_repository_path_regex})\.*/
end

#repository_wiki_git_route_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 205

def repository_wiki_git_route_regex
  @repository_wiki_git_route_regex ||= /#{full_namespace_route_regex}\.*\.wiki\.git/
end

#root_namespace_route_regex ⇒ Object

# File 'lib/gitlab/path_regex.rb', line 156

def root_namespace_route_regex
  @root_namespace_route_regex ||= begin
    illegal_words = Regexp.new(Regexp.union(TOP_LEVEL_ROUTES).source, Regexp::IGNORECASE)

    single_line_regexp %r{
      (?!(#{illegal_words})/)
      #{NAMESPACE_FORMAT_REGEX}
    }x
  end
end