Module: Polytexnic::Preprocessor::Polytex
- Includes:
- Literal
- Included in:
- Polytexnic::Preprocessor
- Defined in:
- lib/polytexnic/preprocessors/polytex.rb
Constant Summary
Constants included from Literal
Literal::CODE_INCLUSION_REGEX, Literal::LANG_REGEX
Instance Method Summary
-
#cache_code_environments ⇒ Object
Caches Markdown code environments.
-
#cache_image_locations(text, cache) ⇒ Object
Caches the locations of images to be passed through the pipeline.
-
#cache_latex_literal(markdown, cache) ⇒ Object
Caches literal LaTeX environments.
-
#cache_math(text, cache) ⇒ Object
Caches math.
-
#cache_raw_latex(markdown, cache) ⇒ Object
Caches raw LaTeX commands to be passed through the pipeline.
-
#convert_code_inclusion(text, cache) ⇒ Object
Adds support for <<(path/to/code) inclusion.
-
#convert_includegraphics(text) ⇒ Object
Converts includegraphics to image.
-
#convert_tt(text) ⇒ Object
Converts {\tt …} to \kode{…}. This effectively converts `inline code`, which kramdown sets as {\tt inline code}, to PolyTeX’s native \kode command, which in turn allows inline code to be separately styled.
-
#restore_hashed_content(text, cache) ⇒ Object
Restores raw code from the cache.
-
#restore_math(text, cache) ⇒ Object
Restores the Markdown math.
-
#to_polytex ⇒ Object
Converts Markdown to PolyTeX.
Methods included from Literal
#cache_display_inline_math, #cache_display_math, #cache_inline_math, #cache_literal, #cache_literal_environments, #cache_unicode, #code_salt, #element, #equation_element, #hyperrefs, #literal_types, #math_environments
Instance Method Details
#cache_code_environments ⇒ Object
Caches Markdown code environments. Included are indented environments, Leanpub-style indented environments, and GitHub-style code fencing.
# File 'lib/polytexnic/preprocessors/polytex.rb', line 136

def cache_code_environments
  output = []
  lines = @source.split("\n")
  indentation = ' ' * 4
  while (line = lines.shift)
    if line =~ /\{lang="(.*?)"\}/
      language = $1
      code = []
      while (line = lines.shift) && line.match(/^#{indentation}(.*)$/) do
        code << $1
      end
      code = code.join("\n")
      key = digest(code)
      code_cache[key] = [code, language]
      output << key
      output << line
    elsif line =~ /^```\s*$/        # basic code fences
      while (line = lines.shift) && !line.match(/^```\s*$/)
        output << indentation + line
      end
      output << "\n"
    elsif line =~ /^```(\w+)(,\s*options:.*)?$/  # highlighted fences
      language = $1
      options  = $2
      code = []
      while (line = lines.shift) && !line.match(/^```\s*$/) do
        code << line
      end
      code = code.join("\n")
      data = [code, language, false, options]
      key = digest(data.join("--"))
      code_cache[key] = data
      output << key
    else
      output << line
    end
  end
  output.join("\n")
end
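The highlighted-fence branch can be tried in isolation. The following is a minimal sketch, not the module's own code: `digest` is a hypothetical stand-in built on `Digest::SHA1`, `cache_fenced_code` is an invented name, and the Leanpub-style and basic-fence branches are left out.

```ruby
require 'digest'

# Hypothetical stand-in for the module's digest helper.
def digest(string)
  Digest::SHA1.hexdigest(string)
end

# Simplified sketch: handles only highlighted fences (three backticks
# followed by a language name) and caches [code, language] under a key.
def cache_fenced_code(source, code_cache)
  output = []
  lines = source.split("\n")
  while (line = lines.shift)
    if line =~ /^`{3}(\w+)\s*$/
      language = $1
      code = []
      # Collect lines until the closing fence.
      while (line = lines.shift) && !line.match(/^`{3}\s*$/)
        code << line
      end
      code = code.join("\n")
      key = digest([code, language].join("--"))
      code_cache[key] = [code, language]
      output << key
    else
      output << line
    end
  end
  output.join("\n")
end

fence = '`' * 3
source = ["Some text", "#{fence}ruby", "puts 'hi'", fence, "More text"].join("\n")
code_cache = {}
result = cache_fenced_code(source, code_cache)
key = code_cache.keys.first
```

After this runs, `result` contains the key on its own line in place of the fenced block, and `code_cache[key]` holds the code together with its language.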
#cache_image_locations(text, cache) ⇒ Object
Caches the locations of images to be passed through the pipeline. This works around a Kramdown bug, which fails to convert images properly when their location includes a URL.
# File 'lib/polytexnic/preprocessors/polytex.rb', line 115

def cache_image_locations(text, cache)
  # Matches Markdown images of the form '![caption](location)'
  text.gsub!(/^\s*(!\[.*?\])\((.*?)\)/) do
    key = digest($2)
    cache[key] = $2
    "\n#{$1}(#{key})"
  end
end
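To see what the substitution does to a URL-based image, here is a small standalone sketch. The `digest` helper is a hypothetical stand-in (the real one lives elsewhere in the library) and the image URL is made up for illustration.

```ruby
require 'digest'

# Hypothetical stand-in for the module's digest helper.
def digest(string)
  Digest::SHA1.hexdigest(string)
end

text = "![A cat](http://example.com/cat.png)\n"
cache = {}

# Same regex as cache_image_locations: $1 is the '![caption]' part,
# $2 is the location inside the parentheses.
text.gsub!(/^\s*(!\[.*?\])\((.*?)\)/) do
  key = digest($2)
  cache[key] = $2
  "\n#{$1}(#{key})"
end

key = cache.keys.first
```

The image location is now an opaque hex key, so kramdown never sees the URL; the original location stays in `cache` until the restore step.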
#cache_latex_literal(markdown, cache) ⇒ Object
Caches literal LaTeX environments.
# File 'lib/polytexnic/preprocessors/polytex.rb', line 65

def cache_latex_literal(markdown, cache)
  Polytexnic::Literal.literal_types.each do |literal|
    regex = /(\\begin\{#{Regexp.escape(literal)}\}
              .*?
              \\end\{#{Regexp.escape(literal)}\})
            /xm
    markdown.gsub!(regex) do
      key = digest($1)
      cache[key] = $1
      key
    end
  end
end
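A short sketch of the same loop on sample input follows. The list of literal types here is a hypothetical subset chosen for illustration (the real list comes from `Polytexnic::Literal.literal_types`), and `digest` is again a stand-in.

```ruby
require 'digest'

# Hypothetical stand-in for the module's digest helper.
def digest(string)
  Digest::SHA1.hexdigest(string)
end

literal_types = %w[verbatim equation]  # assumed subset, for illustration only
markdown = "Intro\n\\begin{verbatim}\n*not emphasis*\n\\end{verbatim}\nOutro"
cache = {}

literal_types.each do |literal|
  # /x allows the layout below; /m lets '.' span newlines.
  regex = /(\\begin\{#{Regexp.escape(literal)}\}
            .*?
            \\end\{#{Regexp.escape(literal)}\})
          /xm
  markdown.gsub!(regex) do
    key = digest($1)
    cache[key] = $1
    key
  end
end

key = cache.keys.first
```

The whole environment, delimiters included, is swapped for a key, so the asterisks inside it are never interpreted as Markdown emphasis.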
#cache_math(text, cache) ⇒ Object
Caches math. Leanpub uses the notation {$$}…{/$$} for both inline and block math, with the only difference being the presence of newlines:
{$$} x^2 {/$$} % inline
and
{$$}
x^2 % block
{/$$}
I personally hate this notation and convention, so we also support LaTeX-style \( x \) and \[ x^2 - 2 = 0 \] notation.
# File 'lib/polytexnic/preprocessors/polytex.rb', line 202

def cache_math(text, cache)
  text.gsub!(/(?:\{\$\$\}\n(.*?)\n\{\/\$\$\}|\\\[(.*?)\\\])/) do
    key = digest($1 || $2)
    cache[[:block, key]] = $1 || $2
    key
  end
  text.gsub!(/(?:\{\$\$\}(.*?)\{\/\$\$\}|\\\((.*?)\\\))/) do
    key = digest($1 || $2)
    cache[[:inline, key]] = $1 || $2
    key
  end
end
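The two passes can be exercised on a string that mixes both notations. This is a sketch under the assumption that `digest` behaves like a plain SHA1 hex digest (a stand-in is defined below); the sample math is invented.

```ruby
require 'digest'

# Hypothetical stand-in for the module's digest helper.
def digest(string)
  Digest::SHA1.hexdigest(string)
end

text = "Inline {$$}x^2{/$$} and \\(y\\), then block:\n{$$}\nx^2 - 2 = 0\n{/$$}\n"
cache = {}

# Block math first: Leanpub delimiters on their own lines, or \[ ... \].
text.gsub!(/(?:\{\$\$\}\n(.*?)\n\{\/\$\$\}|\\\[(.*?)\\\])/) do
  key = digest($1 || $2)
  cache[[:block, key]] = $1 || $2
  key
end

# Inline math second: Leanpub delimiters on one line, or \( ... \).
text.gsub!(/(?:\{\$\$\}(.*?)\{\/\$\$\}|\\\((.*?)\\\))/) do
  key = digest($1 || $2)
  cache[[:inline, key]] = $1 || $2
  key
end
```

Each cache key is a two-element array, `[kind, digest]`, which is what lets the restore step choose between inline and block delimiters later.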
#cache_raw_latex(markdown, cache) ⇒ Object
Caches raw LaTeX commands to be passed through the pipeline.
# File 'lib/polytexnic/preprocessors/polytex.rb', line 80

def cache_raw_latex(markdown, cache)
  command_regex = /(
                    ^[ \t]*\\\w+.*\}[ \t]*$ # Command on line with arg
                    | ~\\ref\{.*?\}     # reference with a tie
                    | ~\\eqref\{.*?\}   # eq reference with a tie
                    | \\[^\s]+\{.*?\}   # command with one arg
                    | \\\w+             # normal command
                    | \\-               # hyphenation
                    | \\[ %&$\#@]       # space or special character
                    )
                  /x
  markdown.gsub!(command_regex) do
    content = $1
    puts content.inspect if debug?
    key = digest(content)
    cache[key] = content
    if content =~ /\{table\}|\\caption\{/
      # Pad tables & captions with newlines for kramdown compatibility.
      "\n#{key}\n"
    else
      key
    end
  end
end
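The alternation is easier to follow on a concrete line. The sketch below reuses the regex with a hypothetical `digest` stand-in and drops the `debug?` output; the sample sentence and caption are invented.

```ruby
require 'digest'

# Hypothetical stand-in for the module's digest helper.
def digest(string)
  Digest::SHA1.hexdigest(string)
end

command_regex = /(
                  ^[ \t]*\\\w+.*\}[ \t]*$ # command on its own line with an argument
                  | ~\\ref\{.*?\}         # reference with a tie
                  | ~\\eqref\{.*?\}       # equation reference with a tie
                  | \\[^\s]+\{.*?\}       # command with one argument
                  | \\\w+                 # normal command
                  | \\-                   # hyphenation
                  | \\[ %&$\#@]           # space or special character
                  )
                /x

markdown = "See Section~\\ref{sec:intro} for 50\\% more \\LaTeX.\n\\caption{A table}"
cache = {}

markdown.gsub!(command_regex) do
  content = $1
  key = digest(content)
  cache[key] = content
  if content =~ /\{table\}|\\caption\{/
    "\n#{key}\n"  # captions and tables get padded with newlines
  else
    key
  end
end

keys = cache.keys
```

Four commands are cached here: the tied reference, the escaped percent sign, the bare `\LaTeX` command, and the caption, which is the only one padded with newlines.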
#convert_code_inclusion(text, cache) ⇒ Object
Adds support for <<(path/to/code) inclusion.
# File 'lib/polytexnic/preprocessors/polytex.rb', line 56

def convert_code_inclusion(text, cache)
  text.gsub!(/^\s*(<<\(.*?\))/) do
    key = digest($1)
    cache[key] = "%= #{$1}"  # reduce to a previously solved case
    key
  end
end
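A brief sketch of the rewrite on a sample inclusion line, assuming a SHA1-based stand-in for `digest` and a made-up file path:

```ruby
require 'digest'

# Hypothetical stand-in for the module's digest helper.
def digest(string)
  Digest::SHA1.hexdigest(string)
end

text = "Listing:\n<<(path/to/code.rb)\n"
cache = {}

text.gsub!(/^\s*(<<\(.*?\))/) do
  key = digest($1)
  # The cached value is prefixed with '%= ', so the restored text
  # uses the '%= <<(...)' form rather than the bare Markdown form.
  cache[key] = "%= #{$1}"
  key
end

key = cache.keys.first
```

Note that the cache stores the prefixed form rather than the matched text, so the restore step emits `%= <<(path/to/code.rb)` where the Markdown had `<<(path/to/code.rb)`.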
#convert_includegraphics(text) ⇒ Object
Converts includegraphics to image. The reason is that raw includegraphics is almost always too wide in the PDF. Instead, we use the custom-defined image command, which is specifically designed to fix this issue.
# File 'lib/polytexnic/preprocessors/polytex.rb', line 180

def convert_includegraphics(text)
  text.gsub!('\includegraphics', '\image')
end
#convert_tt(text) ⇒ Object
Converts {\tt …} to \kode{…}. This effectively converts `inline code`, which kramdown sets as {\tt inline code}, to PolyTeX’s native \kode command, which in turn allows inline code to be separately styled.
# File 'lib/polytexnic/preprocessors/polytex.rb', line 188

def convert_tt(text)
  text.gsub!(/\{\\tt (.*?)\}/, '\kode{\1}')
end
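As a quick illustration, the same substitution applied to a sample string (the sentence is invented) looks like this:

```ruby
# Sample LaTeX of the kind kramdown emits for inline code.
text = "Call {\\tt to_polytex} to convert."

# \1 in the replacement refers to the text captured inside {\tt ...}.
text.gsub!(/\{\\tt (.*?)\}/, '\kode{\1}')
# text is now: Call \kode{to_polytex} to convert.
```

The non-greedy `(.*?)` stops at the first closing brace, so this form assumes the inline code itself contains no `}`.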
#restore_hashed_content(text, cache) ⇒ Object
Restores raw code from the cache.
# File 'lib/polytexnic/preprocessors/polytex.rb', line 125

def restore_hashed_content(text, cache)
  cache.each do |key, value|
    # Because of the way backslashes get interpolated, we need to add
    # some extra ones to cover all the cases of hashed LaTeX.
    text.gsub!(key, value.gsub(/\\/, '\\\\\\'))
  end
end
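The extra backslashes matter because `gsub` treats backslash sequences in a replacement string specially, collapsing a doubled backslash into one. A small sketch with a hypothetical key and a LaTeX line break shows the difference:

```ruby
key   = "0f1e2d"    # hypothetical cache key
value = "a \\\\ b"  # the LaTeX source text: a \\ b

# Naive restore: gsub collapses the doubled backslash in the replacement.
naive = "x 0f1e2d y".gsub(key, value)

# Restore as in restore_hashed_content: double every backslash first,
# so that gsub's own processing brings the count back to the original.
restored = "x 0f1e2d y".gsub(key, value.gsub(/\\/, '\\\\\\'))

puts naive     # prints: x a \ b y
puts restored  # prints: x a \\ b y
```

Only the second form round-trips the LaTeX `\\` intact.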
#restore_math(text, cache) ⇒ Object
Restores the Markdown math. This is easy because we’re running everything through our LaTeX pipeline.
# File 'lib/polytexnic/preprocessors/polytex.rb', line 218

def restore_math(text, cache)
  cache.each do |(kind, key), value|
    case kind
    when :inline
      open  = '\('
      close =  '\)'
    when :block
      open  = '\[' + "\n"
      close = "\n" + '\]'
    end
    text.gsub!(key, open + value + close)
  end
end
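Here is a sketch of the restore loop run against a hand-built cache. The short hex keys are hypothetical placeholders for real digests.

```ruby
# Keys are [kind, digest] pairs, as produced by the math-caching step.
cache = {
  [:inline, "3f2a9c"] => "x^2",
  [:block,  "7b1e04"] => "x^2 - 2 = 0"
}
text = "Inline 3f2a9c and block:\n7b1e04\n"

cache.each do |(kind, key), value|
  case kind
  when :inline
    open  = '\('
    close = '\)'
  when :block
    open  = '\[' + "\n"
    close = "\n" + '\]'
  end
  text.gsub!(key, open + value + close)
end

puts text
```

Whichever notation the author used, the output always uses LaTeX delimiters: `\( … \)` for inline math and `\[ … \]` on separate lines for block math.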
#to_polytex ⇒ Object
Converts Markdown to PolyTeX. We adopt a unified approach: rather than convert “Markdown” (I use the term loosely*) directly to HTML, we convert it to PolyTeX and then run everything through the PolyTeX pipeline. Happily, kramdown comes equipped with a `to_latex` method that does most of the heavy lifting. The output isn’t as clean as that produced by Pandoc (our previous choice), but it comes with significant advantages: (1) It’s written in Ruby, available as a gem, so its use eliminates an external dependency. (2) It’s the foundation for the “Markdown” interpreter used by Leanpub, so by using it ourselves we ensure greater compatibility with Leanpub books.
* <rant>The number of mutually incompatible markup languages going by the name “Markdown” is truly mind-boggling. Most of them add things to John Gruber’s original Markdown language in an ever-expanding attempt to bolt on the functionality needed to write longer documents. At this point, I fear that “Markdown” has become little more than a marketing term.</rant>
# File 'lib/polytexnic/preprocessors/polytex.rb', line 25

def to_polytex
  require 'kramdown'
  cache = {}
  math_cache = {}
  cleaned_markdown = cache_code_environments
  puts cleaned_markdown if debug?
  cleaned_markdown.tap do |markdown|
    convert_code_inclusion(markdown, cache)
    cache_latex_literal(markdown, cache)
    cache_raw_latex(markdown, cache)
    cache_image_locations(markdown, cache)
    puts markdown if debug?
    cache_math(markdown, math_cache)
  end
  puts cleaned_markdown if debug?
  # Override the header ordering, which starts with 'section' by default.
  lh = 'chapter,section,subsection,subsubsection,paragraph,subparagraph'
  kramdown = Kramdown::Document.new(cleaned_markdown, latex_headers: lh)
  puts kramdown.inspect if debug?
  puts kramdown.to_html if debug?
  puts kramdown.to_latex if debug?
  @source = kramdown.to_latex.tap do |polytex|
                remove_comments(polytex)
                convert_includegraphics(polytex)
                convert_tt(polytex)
                restore_math(polytex, math_cache)
                restore_hashed_content(polytex, cache)
              end
end
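The overall shape of the method is cache, convert, restore. The sketch below shows that pattern without kramdown. `toy_convert` is a hypothetical stand-in converter that only handles `*emphasis*`, and `digest` is a stand-in as well, so this illustrates the idea rather than the real pipeline.

```ruby
require 'digest'

# Hypothetical stand-in for the module's digest helper.
def digest(string)
  Digest::SHA1.hexdigest(string)
end

# Hypothetical stand-in for the Markdown-to-LaTeX conversion:
# it only turns *text* into \emph{text}.
def toy_convert(markdown)
  markdown.gsub(/\*(.+?)\*/) { "\\emph{#{$1}}" }
end

source = "A *big* claim: \\(a * b * c\\)."
math_cache = {}

# 1. Cache: replace inline math with opaque keys the converter ignores.
protected_text = source.gsub(/\\\((.*?)\\\)/) do
  key = digest($1)
  math_cache[key] = $1
  key
end

# 2. Convert the protected text.
converted = toy_convert(protected_text)

# 3. Restore: put the math back. The block form of gsub! is used here
#    so the replacement is inserted without backslash processing.
math_cache.each do |key, value|
  converted.gsub!(key) { "\\(#{value}\\)" }
end

unprotected = toy_convert(source)  # for comparison: math gets mangled
```

With caching, `converted` keeps the math untouched; without it, the toy converter reads the asterisks inside the math as emphasis markers, which is the kind of damage the cache steps are there to prevent.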