Class: Orgmode::RegexpHelper
- Inherits: Object
  - Object
  - Orgmode::RegexpHelper
- Defined in: lib/org-ruby/regexp_helper.rb
Overview
Summary
This class contains helper routines to deal with the Regexp “black magic” you need to properly parse org-mode files.
Key methods
- Use rewrite_emphasis to replace org-mode emphasis strings (e.g., /italic/) with the suitable markup for the output.
- Use rewrite_links to get a chance to rewrite all org-mode links with suitable markup for the output.
- Use rewrite_images to rewrite all inline image links with suitable markup for the output.
Instance Attribute Summary
- #body_regexp ⇒ Object (readonly): Returns the value of attribute body_regexp.
- #border_forbidden ⇒ Object (readonly): Returns the value of attribute border_forbidden.
- #markers ⇒ Object (readonly): Returns the value of attribute markers.
- #org_emphasis_regexp ⇒ Object (readonly): Returns the value of attribute org_emphasis_regexp.
- #post_emphasis ⇒ Object (readonly): Returns the value of attribute post_emphasis.
- #pre_emphasis ⇒ Object (readonly): Characters allowed as prematch for emphasis; see the EMPHASIS notes under Instance Attribute Details.
Instance Method Summary
- #initialize ⇒ RegexpHelper (constructor): A new instance of RegexpHelper.
- #match_all(str) ⇒ Object: Finds all emphasis matches in a string.
- #rewrite_emphasis(str) ⇒ Object: Compute replacements for all matching emphasized phrases.
- #rewrite_footnote(str) ⇒ Object: Rewrites footnotes.
- #rewrite_images(str) ⇒ Object: Rewrites all of the inline image tags.
- #rewrite_links(str) ⇒ Object: Rewrites org-mode links in a string to markup suitable to the output format.
- #rewrite_subp(str) ⇒ Object: Rewrites subscript and superscript markup (_{foo} and ^{bar}).
Constructor Details
#initialize ⇒ RegexpHelper
Returns a new instance of RegexpHelper.
# File 'lib/org-ruby/regexp_helper.rb', line 53
def initialize
  # Set up the emphasis regular expression.
  @pre_emphasis = " \t\\('\""
  @post_emphasis = "- \t.,:!?;'\"\\)"
  @border_forbidden = " \t\r\n,\"'"
  @body_regexp = ".*?"
  @markers = "*/_=~+"
  @logger = Logger.new(STDERR)
  @logger.level = Logger::WARN
  build_org_emphasis_regexp
  build_org_link_regexp
  @org_subp_regexp = /([_^])\{(.*?)\}/
  @org_footnote_regexp = /\[fn:(.+?)(:(.*?))?\]/
end
Instance Attribute Details
#body_regexp ⇒ Object (readonly)
Returns the value of attribute body_regexp.
# File 'lib/org-ruby/regexp_helper.rb', line 48
def body_regexp
  @body_regexp
end
#border_forbidden ⇒ Object (readonly)
Returns the value of attribute border_forbidden.
# File 'lib/org-ruby/regexp_helper.rb', line 47
def border_forbidden
  @border_forbidden
end
#markers ⇒ Object (readonly)
Returns the value of attribute markers.
# File 'lib/org-ruby/regexp_helper.rb', line 49
def markers
  @markers
end
#org_emphasis_regexp ⇒ Object (readonly)
Returns the value of attribute org_emphasis_regexp.
# File 'lib/org-ruby/regexp_helper.rb', line 51
def org_emphasis_regexp
  @org_emphasis_regexp
end
#post_emphasis ⇒ Object (readonly)
Returns the value of attribute post_emphasis.
# File 'lib/org-ruby/regexp_helper.rb', line 46
def post_emphasis
  @post_emphasis
end
#pre_emphasis ⇒ Object (readonly)
EMPHASIS
I figure it’s best to stick as closely to the elisp implementation as possible for emphasis. org.el defines the regular expression that is used to apply “emphasis” (in my terminology, inline formatting instead of block formatting). Here’s the documentation from org.el.
Terminology: In an emphasis string like “ *strong word* ”, we call the initial space PREMATCH, the final space POSTMATCH, the stars MARKERS, “s” and “d” are BORDER characters and “trong wor” is the body. The different components in this variable specify what is allowed/forbidden in each part:
- pre: Chars allowed as prematch. Line beginning allowed, too.
- post: Chars allowed as postmatch. Line end will be allowed too.
- border: The chars forbidden as border characters.
- body-regexp: A regexp like "." to match a body character. Don’t use non-shy groups here, and don’t allow newline here.
- newline: The maximum number of newlines allowed in an emphasis exp.
I currently don’t use newline because I’ve thrown this information away by this point in the code. TODO – revisit?
# File 'lib/org-ruby/regexp_helper.rb', line 45
def pre_emphasis
  @pre_emphasis
end
Instance Method Details
#match_all(str) ⇒ Object
Finds all emphasis matches in a string. Supply a block that will get the marker and body as parameters.
# File 'lib/org-ruby/regexp_helper.rb', line 70
def match_all(str)
  str.scan(@org_emphasis_regexp) do |match|
    yield $2, $3
  end
end
#rewrite_emphasis(str) ⇒ Object
Compute replacements for all matching emphasized phrases. Supply a block that will get the marker and body as parameters; return the replacement string from your block.
Example
re = Orgmode::RegexpHelper.new
result = re.rewrite_emphasis("*bold*, /italic/, =code=") do |marker, body|
  "<#{map[marker]}>#{body}</#{map[marker]}>"
end
In this example, the block body will get called three times:
- Marker: “*”, body: “bold”
- Marker: “/”, body: “italic”
- Marker: “=”, body: “code”
The return from this block is a string that will be used to replace “*bold*”, “/italic/”, and “=code=”, respectively. (Clearly this sample string will use HTML-like syntax, assuming map is defined appropriately.)
# File 'lib/org-ruby/regexp_helper.rb', line 97
def rewrite_emphasis(str)
  str.gsub(@org_emphasis_regexp) do |match|
    inner = yield $2, $3
    "#{$1}#{inner}#{$4}"
  end
end
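The following standalone sketch may help show how the prematch, marker, body, and postmatch groups fit together in rewrite_emphasis. It does not load org-ruby. The component strings are copied from the constructor, but EMPHASIS_SKETCH, TAG_FOR_MARKER, and rewrite_emphasis_sketch are hypothetical names, and the pattern is a simplified assumption, since the output of build_org_emphasis_regexp is not shown on this page.

```ruby
# Component strings copied from RegexpHelper#initialize.
PRE_EMPHASIS     = " \t\\('\""
POST_EMPHASIS    = "- \t.,:!?;'\"\\)"
BORDER_FORBIDDEN = " \t\r\n,\"'"
BODY_REGEXP      = ".*?"
MARKERS          = "*/_=~+"

# ASSUMPTION: a simplified pattern assembled from those components. The
# library's real emphasis regexp may differ.
# Groups: 1 = prematch, 2 = marker, 3 = body, 4 = postmatch.
EMPHASIS_SKETCH = Regexp.new(
  "([#{PRE_EMPHASIS}]|^)" \
  "([#{MARKERS}])" \
  "([^#{BORDER_FORBIDDEN}]|[^#{BORDER_FORBIDDEN}]#{BODY_REGEXP}[^#{BORDER_FORBIDDEN}])" \
  "\\2" \
  "([#{POST_EMPHASIS}]|$)"
)

# Hypothetical marker-to-tag table, standing in for `map` in the example.
TAG_FOR_MARKER = { "*" => "b", "/" => "i", "=" => "code" }.freeze

# Mirrors the shape of rewrite_emphasis: the block sees marker and body,
# and the prematch and postmatch characters are put back around its result.
def rewrite_emphasis_sketch(str)
  str.gsub(EMPHASIS_SKETCH) do
    m = Regexp.last_match
    inner = yield m[2], m[3]
    "#{m[1]}#{inner}#{m[4]}"
  end
end

emphasis_html = rewrite_emphasis_sketch("*bold*, /italic/, =code=") do |marker, body|
  tag = TAG_FOR_MARKER.fetch(marker, "span")
  "<#{tag}>#{body}</#{tag}>"
end
puts emphasis_html
```

Because this simplified pattern consumes the postmatch character, two emphasized phrases separated by a single space would not both match here; that is a limitation of the sketch, not a statement about the library.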
#rewrite_footnote(str) ⇒ Object
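Rewrites footnotes. Supply a block that receives the footnote name and its definition (nil when the footnote has none); the block's return value replaces the footnote.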
# File 'lib/org-ruby/regexp_helper.rb', line 112
def rewrite_footnote(str) # :yields: name, definition or nil
  str.gsub(@org_footnote_regexp) do |match|
    yield $1, $3
  end
end
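As a rough illustration of what the block receives, here is a standalone sketch that reuses the footnote pattern from the constructor without loading org-ruby. The helper name rewrite_footnote_sketch and the HTML it emits are assumptions made for this example only.

```ruby
# Footnote pattern copied from RegexpHelper#initialize.
ORG_FOOTNOTE_REGEXP = /\[fn:(.+?)(:(.*?))?\]/

# Hypothetical standalone helper mirroring rewrite_footnote: yields the
# footnote name and its inline definition (nil when absent).
def rewrite_footnote_sketch(str)
  str.gsub(ORG_FOOTNOTE_REGEXP) do
    m = Regexp.last_match
    yield m[1], m[3]
  end
end

text = "See the note[fn:1] and the aside[fn:why:because it helps]."
footnote_html = rewrite_footnote_sketch(text) do |name, definition|
  # Illustrative markup only; any string may be returned.
  title = definition ? " title=\"#{definition}\"" : ""
  "<sup><a href=\"#fn-#{name}\"#{title}>#{name}</a></sup>"
end
puts footnote_html
```

The first footnote yields a nil definition, so the block has to handle both cases.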
#rewrite_images(str) ⇒ Object
Rewrites all of the inline image tags.
# File 'lib/org-ruby/regexp_helper.rb', line 159
def rewrite_images(str) # :yields: image_link
  str.gsub(@org_img_regexp) do |match|
    yield $1
  end
end
#rewrite_links(str) ⇒ Object
Summary
Rewrite org-mode links in a string to markup suitable to the output format.
Usage
Give this a block that expects the link and optional friendly text. Return how that link should get formatted.
Example
re = Orgmode::RegexpHelper.new
result = re.rewrite_links("[[http://www.bing.com]] and [[http://www.hotmail.com][Hotmail]]") do |link, text|
  text ||= link
  "<a href=\"#{link}\">#{text}</a>"
end
In this example, the block body will get called two times. In the first instance, text will be nil (the org-mode markup gives no friendly text for the link http://www.bing.com). In the second instance, the block will get text of Hotmail and the link http://www.hotmail.com. In both cases, the block returns an HTML-style link, and that is how things will get recorded in result.
# File 'lib/org-ruby/regexp_helper.rb', line 143
def rewrite_links(str) # :yields: link, text
  i = str.gsub(@org_link_regexp) do |match|
    yield $1, nil
  end
  if str =~ @org_angle_link_text_regexp
    i.gsub(@org_angle_link_text_regexp) do |match|
      yield "#{$2}:#{$3}", nil
    end
  else
    i.gsub(@org_link_text_regexp) do |match|
      yield $1, $2
    end
  end
end
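The two-pass structure above (bare links first, then links with friendly text) can be sketched standalone as follows. The link patterns built by build_org_link_regexp are not shown on this page, so BARE_LINK_SKETCH and TEXT_LINK_SKETCH are simplified assumptions, rewrite_links_sketch is a hypothetical name, and the angle-link branch is left out.

```ruby
# ASSUMPTION: simplified stand-ins for the library's link patterns.
BARE_LINK_SKETCH = /\[\[([^\[\]]+)\]\]/                  # [[link]]
TEXT_LINK_SKETCH = /\[\[([^\[\]]+)\]\[([^\[\]]+)\]\]/    # [[link][text]]

# Hypothetical helper mirroring the two passes of rewrite_links.
def rewrite_links_sketch(str)
  # Pass 1: links without friendly text; the block gets nil for text.
  pass_one = str.gsub(BARE_LINK_SKETCH) do
    yield Regexp.last_match[1], nil
  end
  # Pass 2: links with friendly text.
  pass_one.gsub(TEXT_LINK_SKETCH) do
    m = Regexp.last_match
    yield m[1], m[2]
  end
end

source = "[[http://www.bing.com]] and [[http://www.hotmail.com][Hotmail]]"
links_html = rewrite_links_sketch(source) do |link, text|
  text ||= link
  "<a href=\"#{link}\">#{text}</a>"
end
puts links_html
```

Running the second pass over the output of the first works here because the replacement markup contains no double brackets; a block that emitted org-style links itself would be rewritten again.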
#rewrite_subp(str) ⇒ Object
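Rewrites subscript and superscript markup (_{foo} and ^{bar}). The block receives the type ("_" for subscript, "^" for superscript) and the text between the braces.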
# File 'lib/org-ruby/regexp_helper.rb', line 105
def rewrite_subp(str) # :yields: type ("_" for subscript and "^" for superscript), text
  str.gsub(@org_subp_regexp) do |match|
    yield $1, $2
  end
end
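A minimal standalone sketch of how a caller might map the yielded type to markup, reusing the subscript/superscript pattern from the constructor without loading org-ruby. The helper name rewrite_subp_sketch and the choice of HTML tags are assumptions for illustration.

```ruby
# Pattern copied from RegexpHelper#initialize; note that it requires braces.
ORG_SUBP_REGEXP = /([_^])\{(.*?)\}/

# Hypothetical standalone helper mirroring rewrite_subp.
def rewrite_subp_sketch(str)
  str.gsub(ORG_SUBP_REGEXP) do
    m = Regexp.last_match
    yield m[1], m[2]
  end
end

subp_html = rewrite_subp_sketch("H_{2}O and E = mc^{2}") do |type, text|
  tag = type == "_" ? "sub" : "sup"
  "<#{tag}>#{text}</#{tag}>"
end
puts subp_html
```

Text such as snake_case is left alone by this pattern because no brace follows the underscore.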