Class: Banzai::Filter::AutolinkFilter
- Inherits:
- HTML::Pipeline::Filter
- Object
- HTML::Pipeline::Filter
- Banzai::Filter::AutolinkFilter
- Includes:
- ActionView::Helpers::TagHelper, Gitlab::Utils::SanitizeNodeLink
- Defined in:
- lib/banzai/filter/autolink_filter.rb
Overview
HTML Filter for auto-linking URLs in HTML.
Based on HTML::Pipeline::AutolinkFilter
Note that our CommonMark parser, `commonmarker` (using the autolink extension), handles standard autolinking, like http/https. We detect additional schemes (smb, rdar, etc.).
Context options:
:autolink - Boolean, skips all processing done by this filter when false
:link_attr - Hash of attributes for the generated links
Constant Summary
- LINK_PATTERN =
Pattern to match text that should be autolinked.
A URI scheme begins with a letter and may contain letters, numbers, plus, period and hyphen. Schemes are case-insensitive but we’re being picky here and allowing only lowercase for autolinks.
See en.wikipedia.org/wiki/URI_scheme
The negative lookbehind ensures that users can paste a URL followed by punctuation without those characters being included in the generated link. It matches the behaviour of Rinku 2.0.1: github.com/vmg/rinku/blob/v2.0.1/ext/rinku/autolink.c#L65
Rubular: rubular.com/r/nrL3r9yUiq
Note that it’s not possible to use Gitlab::UntrustedRegexp for LINK_PATTERN, as `(?<!` is unsupported in `re2`, see github.com/google/re2/wiki/Syntax
%r{([a-z][a-z0-9\+\.-]+://[^\s>]+)(?<!\?|!|\.|,|:)}
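The effect of the negative lookbehind in LINK_PATTERN is easiest to see on a few inputs. The following is a minimal sketch that copies the pattern as listed above; the sample strings are made up for illustration and do not come from the GitLab test suite.

```ruby
# Pattern copied from the constant listed above.
LINK_PATTERN = %r{([a-z][a-z0-9\+\.-]+://[^\s>]+)(?<!\?|!|\.|,|:)}

# Hypothetical sample inputs, chosen to end in punctuation.
samples = [
  'See smb://server/share.',
  'Open rdar://1234, then retry',
  'Is it ftp://example.com/file?',
  'SMB://server/share is skipped'
]

# String#[] with a regexp and a group index returns that capture, or nil.
matches = samples.map { |text| text[LINK_PATTERN, 1] }

matches.each { |m| puts m.inspect }
# "smb://server/share"
# "rdar://1234"
# "ftp://example.com/file"
# nil
```

The trailing `.`, `,` and `?` are left out of the captured link because the lookbehind forces the greedy `[^\s>]+` to backtrack past them. The last sample yields no match because only lowercase schemes are accepted.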
- ENTITY_UNTRUSTED =
'((?:&[\w#]+;)+)\z'
- ENTITY_UNTRUSTED_REGEX =
Gitlab::UntrustedRegexp.new(ENTITY_UNTRUSTED, multiline: false)
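This page does not show where ENTITY_UNTRUSTED_REGEX is applied. As a rough illustration of what the pattern matches, the sketch below compiles the same string with Ruby's built-in Regexp (an assumption made so the example runs outside GitLab; the real constant uses Gitlab::UntrustedRegexp) and splits a run of trailing HTML entities off a link. The helper name `split_trailing_entities` is hypothetical.

```ruby
# Pattern string copied from the constant listed above.
ENTITY_UNTRUSTED = '((?:&[\w#]+;)+)\z'

# Assumption: plain Regexp stands in for Gitlab::UntrustedRegexp here.
ENTITY_REGEX = Regexp.new(ENTITY_UNTRUSTED)

# Hypothetical helper: returns [link_without_entities, trailing_entities].
def split_trailing_entities(link)
  entities = link[ENTITY_REGEX, 1].to_s # "" when nothing matches
  [link.delete_suffix(entities), entities]
end

p split_trailing_entities('smb://host/share&quot;&gt;')
# ["smb://host/share", "&quot;&gt;"]
p split_trailing_entities('smb://host/a&amp;b')
# ["smb://host/a&amp;b", ""]
```

Only entities at the very end of the string are captured, because the pattern is anchored with `\z`; an entity in the middle of the link is left alone.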
- IGNORE_PARENTS =
Text matching LINK_PATTERN inside these elements will not be linked
%w[a code kbd pre script style].to_set
- TEXT_QUERY =
The XPath query to use for finding text nodes to parse.
%(descendant-or-self::text()[ not(#{IGNORE_PARENTS.map { |p| "ancestor::#{p}" }.join(' or ')}) and contains(., '://') ])
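To make the interpolation in TEXT_QUERY concrete, this small sketch rebuilds the string from IGNORE_PARENTS exactly as listed above and prints the XPath expression that results. It uses only the standard library; running the query against a document would additionally need an XPath-capable parser, which is left out here.

```ruby
require 'set'

# Both constants copied from the listing above.
IGNORE_PARENTS = %w[a code kbd pre script style].to_set
TEXT_QUERY = %(descendant-or-self::text()[ not(#{IGNORE_PARENTS.map { |p| "ancestor::#{p}" }.join(' or ')}) and contains(., '://') ])

puts TEXT_QUERY
# descendant-or-self::text()[ not(ancestor::a or ancestor::code or ancestor::kbd or ancestor::pre or ancestor::script or ancestor::style) and contains(., '://') ]
```

The expanded query selects text nodes that contain `://` and have none of the ignored elements as an ancestor, so candidate text is narrowed down before LINK_PATTERN is ever applied.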
- PUNCTUATION_PAIRS =
{ "'" => "'", '"' => '"', ')' => '(', ']' => '[', '}' => '{' }.freeze
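This page lists PUNCTUATION_PAIRS without describing how it is used. Since it maps closing characters to their openers, one plausible reading, assumed here rather than taken from the source, is Rinku-style trimming: drop a trailing closer when the link does not contain a matching number of openers. The helper `trim_unbalanced_closer` is hypothetical.

```ruby
# Constant copied from the listing above: closing character => opening character.
PUNCTUATION_PAIRS = { "'" => "'", '"' => '"', ')' => '(', ']' => '[', '}' => '{' }.freeze

# Hypothetical helper (an assumption, not the filter's actual code).
# Returns [link, dropped_character].
def trim_unbalanced_closer(link)
  closer = link[-1]
  opener = PUNCTUATION_PAIRS[closer]
  return [link, ''] if opener.nil? # link does not end with a known closer

  # Quotes open and close with the same character, so they never count as balanced.
  balanced = opener != closer && link.count(opener) == link.count(closer)
  balanced ? [link, ''] : [link[0..-2], closer]
end

p trim_unbalanced_closer('smb://host/a_(b)')  # ["smb://host/a_(b)", ""]
p trim_unbalanced_closer('smb://host/share)') # ["smb://host/share", ")"]
p trim_unbalanced_closer("smb://host/share'") # ["smb://host/share", "'"]
```

Under this assumption, a link written inside parentheses in running text keeps its own balanced brackets but loses the closer that belongs to the surrounding sentence.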
Constants included from Gitlab::Utils::SanitizeNodeLink
Gitlab::Utils::SanitizeNodeLink::ATTRS_TO_SANITIZE, Gitlab::Utils::SanitizeNodeLink::UNSAFE_PROTOCOLS
Instance Method Summary
Methods included from Gitlab::Utils::SanitizeNodeLink
#remove_unsafe_links, #safe_protocol?, #sanitize_unsafe_links
Instance Method Details
#call ⇒ Object
# File 'lib/banzai/filter/autolink_filter.rb', line 61

def call
  return doc if context[:autolink] == false

  doc.xpath(TEXT_QUERY).each do |node|
    content = node.to_html

    next unless content.match(LINK_PATTERN)

    html = autolink_filter(content)

    next if html == content

    node.replace(html)
  end

  doc
end