Class: Banzai::Filter::MarkdownPreEscapeFilter

Inherits:
HTML::Pipeline::TextFilter
  • Object
Defined in:
lib/banzai/filter/markdown_pre_escape_filter.rb

Overview

In order to allow a user to short-circuit our reference shortcuts (such as # or !), the user should be able to escape them, like \#. CommonMark supports this, but it discards all information about which characters were originally escaped literals. To short-circuit the reference, we must surround each backslash-escaped ASCII punctuation character with a custom sequence. This way CommonMark still handles the backslash-escaped characters properly, while we retain the knowledge (the sequence) that the character was a literal.
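As a rough illustration of the surrounding step, the sketch below applies the constants from this page to a plain string. The module name `PreEscapeSketch` and the helper `wrap` are hypothetical and exist only for this example; the real filter runs inside the Banzai pipeline.

```ruby
# Hypothetical stand-alone module; the constant values are copied from this page.
module PreEscapeSketch
  REFERENCE_CHARACTERS = '@#!$&~%^'
  ASCII_PUNCTUATION = %r{(\\[#{REFERENCE_CHARACTERS}])}.freeze
  LITERAL_KEYWORD = 'cmliteral'

  # Surround every backslash-escaped reference character with the marker.
  def self.wrap(text)
    text.gsub(ASCII_PUNCTUATION) do |match|
      "#{LITERAL_KEYWORD}-#{match}-#{LITERAL_KEYWORD}"
    end
  end
end

puts PreEscapeSketch.wrap('See \#123 and \!45, but not #6')
# prints: See cmliteral-\#-cmliteral123 and cmliteral-\!-cmliteral45, but not #6
```

The backslash stays inside the markers, so CommonMark still sees an ordinary backslash escape, and the unescaped `#6` is left for normal reference processing.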

We need to surround the character, not just prefix it: CommonMark could convert the character into an entity, and then we would not know how many characters belong to the literal. The entire literal needs to end up surrounded with a `span` tag, which short-circuits our reference processing.
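The sketch below shows why a closing marker helps once an entity is involved. It is not the actual post-processing code: the `RENDERED` string is a hand-written stand-in for what a CommonMark renderer might emit for an escaped `&`, and the `SURROUNDED` pattern and `to_span` helper are assumptions made up for this example.

```ruby
# Hypothetical sketch; nothing here is taken from the GitLab source.
module SurroundSketch
  LITERAL_KEYWORD = 'cmliteral'

  # Hand-written stand-in for rendered output of `cmliteral-\&-cmliteral`:
  # the backslash is gone and the single `&` has become a five-character entity.
  RENDERED = 'a cmliteral-&amp;-cmliteral b'

  # Assumed pattern: non-greedy match of whatever sits between the two markers.
  SURROUNDED = /#{LITERAL_KEYWORD}-(.*?)-#{LITERAL_KEYWORD}/

  # Replace each marked literal with a span around its rendered form.
  def self.to_span(html)
    html.gsub(SURROUNDED) { "<span>#{Regexp.last_match(1)}</span>" }
  end
end

puts SurroundSketch.to_span(SurroundSketch::RENDERED)
# prints: a <span>&amp;</span> b
```

With only a prefix marker, a later step would have to guess whether the literal is one character or a whole entity; the closing marker removes that guess.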

We can't use a custom HTML tag, because the text we surround could initially sit inside an href, and CommonMark would then be unable to parse the link properly. So we use the plain-text markers `cmliteral-` and `-cmliteral`.

spec.commonmark.org/0.29/#backslash-escapes

This filter does the initial surrounding, and MarkdownPostEscapeFilter does the conversion into span tags.

Constant Summary

REFERENCE_CHARACTERS =

We just need to target those that are special GitLab references

'@#!$&~%^'
ASCII_PUNCTUATION =
%r{(\\[#{REFERENCE_CHARACTERS}])}.freeze
LITERAL_KEYWORD =
'cmliteral'
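Despite its name, ASCII_PUNCTUATION as defined above matches only a backslash followed by one of REFERENCE_CHARACTERS, not every ASCII punctuation character. A small check, using a hypothetical module `ConstantSketch` and helper `escapes_in` that are not part of the class:

```ruby
# Hypothetical module for illustration; the constant values are copied from this page.
module ConstantSketch
  REFERENCE_CHARACTERS = '@#!$&~%^'
  ASCII_PUNCTUATION = %r{(\\[#{REFERENCE_CHARACTERS}])}.freeze

  # List the backslash escapes in text that the pattern would pick up.
  # scan returns one array per match because the pattern has a capture group.
  def self.escapes_in(text)
    text.scan(ASCII_PUNCTUATION).flatten
  end
end

found = ConstantSketch.escapes_in('\@user \*emphasis\* \~label \_x')
puts found.join(' ')
# prints: \@ \~
```

Escapes such as `\*` and `\_` are not matched, so they pass through to CommonMark untouched.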

Instance Method Summary

Instance Method Details

#call ⇒ Object


# File 'lib/banzai/filter/markdown_pre_escape_filter.rb', line 32

def call
  @text.gsub(ASCII_PUNCTUATION) do |match|
    # The majority of markdown does not have literals.  Record when one
    # is found; if the flag is never set, we can bypass the post filter
    result[:escaped_literals] = true

    "#{LITERAL_KEYWORD}-#{match}-#{LITERAL_KEYWORD}"
  end
end
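
To see how the `:escaped_literals` flag behaves, the sketch below reproduces `call` in a stand-alone class. `PreEscapeSketchFilter` is a hypothetical stand-in with its own `@text` and `result` hash; it does not inherit from HTML::Pipeline::TextFilter and is only meant to make the flag observable outside the pipeline.

```ruby
# Hypothetical stand-in for the filter; constants are copied from this page.
class PreEscapeSketchFilter
  REFERENCE_CHARACTERS = '@#!$&~%^'
  ASCII_PUNCTUATION = %r{(\\[#{REFERENCE_CHARACTERS}])}.freeze
  LITERAL_KEYWORD = 'cmliteral'

  attr_reader :result

  def initialize(text)
    @text = text
    @result = {} # plain hash standing in for the pipeline's shared result
  end

  def call
    @text.gsub(ASCII_PUNCTUATION) do |match|
      # The block only runs when an escaped reference character is found.
      result[:escaped_literals] = true
      "#{LITERAL_KEYWORD}-#{match}-#{LITERAL_KEYWORD}"
    end
  end
end

plain = PreEscapeSketchFilter.new('Issue #1 is fixed')
plain.call
puts plain.result.key?(:escaped_literals)   # prints: false

escaped = PreEscapeSketchFilter.new('Issue \#1 is fixed')
escaped.call
puts escaped.result.key?(:escaped_literals) # prints: true
```

Because the flag is set only inside the `gsub` block, text without escaped literals leaves `result` empty, which is the condition under which a later step could skip its work.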