Class: Orgmode::RegexpHelper

Inherits:
Object
Defined in:
lib/org-ruby/regexp_helper.rb

Overview

Summary

This class contains helper routines to deal with the Regexp “black magic” you need to properly parse org-mode files.

Key methods

  • Use rewrite_emphasis to replace org-mode emphasis strings (e.g., /italic/) with the suitable markup for the output.

  • Use rewrite_links to get a chance to rewrite all org-mode links with suitable markup for the output.

  • Use rewrite_images to rewrite all inline image links with suitable markup for the output.


Constructor Details

#initialize ⇒ RegexpHelper

Returns a new instance of RegexpHelper.



# File 'lib/org-ruby/regexp_helper.rb', line 53

def initialize
  # Set up the emphasis regular expression.
  @pre_emphasis = " \t\\('\""
  @post_emphasis = "- \t.,:!?;'\"\\)"
  @border_forbidden = " \t\r\n,\"'"
  @body_regexp = ".*?"
  @markers = "*/_=~+"
  @logger = Logger.new(STDERR)
  @logger.level = Logger::WARN
  build_org_emphasis_regexp
  build_org_link_regexp
  @org_subp_regexp = /([_^])\{(.*?)\}/
  @org_footnote_regexp = /\[fn:(.+?)(:(.*?))?\]/
end

Instance Attribute Details

#body_regexp ⇒ Object (readonly)

Returns the value of attribute body_regexp.



# File 'lib/org-ruby/regexp_helper.rb', line 48

def body_regexp
  @body_regexp
end

#border_forbidden ⇒ Object (readonly)

Returns the value of attribute border_forbidden.



# File 'lib/org-ruby/regexp_helper.rb', line 47

def border_forbidden
  @border_forbidden
end

#markers ⇒ Object (readonly)

Returns the value of attribute markers.



# File 'lib/org-ruby/regexp_helper.rb', line 49

def markers
  @markers
end

#org_emphasis_regexp ⇒ Object (readonly)

Returns the value of attribute org_emphasis_regexp.



# File 'lib/org-ruby/regexp_helper.rb', line 51

def org_emphasis_regexp
  @org_emphasis_regexp
end

#post_emphasis ⇒ Object (readonly)

Returns the value of attribute post_emphasis.



# File 'lib/org-ruby/regexp_helper.rb', line 46

def post_emphasis
  @post_emphasis
end

#pre_emphasis ⇒ Object (readonly)

EMPHASIS

I figure it’s best to stick as closely to the elisp implementation as possible for emphasis. org.el defines the regular expression that is used to apply “emphasis” (in my terminology, inline formatting instead of block formatting). Here’s the documentation from org.el.

Terminology: In an emphasis string like “ *strong word* ”, we call the initial space PREMATCH, the final space POSTMATCH, the stars MARKERS, “s” and “d” are BORDER characters and “trong wor” is the body. The different components in this variable specify what is allowed/forbidden in each part:

  • pre: Chars allowed as prematch. Line beginning allowed, too.

  • post: Chars allowed as postmatch. Line end will be allowed too.

  • border: The chars forbidden as border characters.

  • body-regexp: A regexp like "." to match a body character. Don’t use non-shy groups here, and don’t allow newline here.

  • newline: The maximum number of newlines allowed in an emphasis exp.

I currently don’t use newline because I’ve thrown this information away by this point in the code. TODO – revisit?
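
To make the terminology concrete, here is a minimal sketch of how these components could be combined into one regexp. The component strings are the ones set in the constructor; the way they are assembled, and the constant names (PRE, POST, BORDER, BODY, MARKERS, EMPHASIS_SKETCH), are assumptions for illustration and may differ from what build_org_emphasis_regexp actually does.

```ruby
# Component strings copied from RegexpHelper#initialize.
PRE     = " \t\\('\""
POST    = "- \t.,:!?;'\"\\)"
BORDER  = " \t\r\n,\"'"
BODY    = ".*?"
MARKERS = "*/_=~+"

# Assumed assembly (hypothetical): four capture groups holding the prematch,
# the marker, the border characters plus body, and the postmatch.
EMPHASIS_SKETCH = Regexp.new(
  "([#{PRE}]|^)" \
  "([#{MARKERS}])(?!\\2)" \
  "([^#{BORDER}]|[^#{BORDER}]#{BODY}[^#{BORDER}])\\2" \
  "([#{POST}]|$)"
)

# The example string from the org.el documentation quoted above.
m = EMPHASIS_SKETCH.match(" *strong word* ")
parts = { pre: m[1], marker: m[2], text: m[3], post: m[4] }
```

In this sketch the third group holds the border characters together with the body ("strong word"), which matches the way match_all and rewrite_emphasis read the marker from the second group and the text from the third.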



# File 'lib/org-ruby/regexp_helper.rb', line 45

def pre_emphasis
  @pre_emphasis
end

Instance Method Details

#match_all(str) ⇒ Object

Finds all emphasis matches in a string. Supply a block that will get the marker and body as parameters.



# File 'lib/org-ruby/regexp_helper.rb', line 70

def match_all(str)
  str.scan(@org_emphasis_regexp) do |match|
    yield $2, $3
  end
end

#rewrite_emphasis(str) ⇒ Object

Compute replacements for all matching emphasized phrases. Supply a block that will get the marker and body as parameters; return the replacement string from your block.

Example

re = RegexpHelper.new
result = re.rewrite_emphasis("*bold*, /italic/, =code=") do |marker, body|
    "<#{map[marker]}>#{body}</#{map[marker]}>"
end

In this example, the block body will get called three times:

  1. Marker: “*”, body: “bold”

  2. Marker: “/”, body: “italic”

  3. Marker: “=”, body: “code”

The return value of the block is a string that replaces “*bold*”, “/italic/”, and “=code=”, respectively. (Clearly this sample string will use HTML-like syntax, assuming map is defined appropriately.)



# File 'lib/org-ruby/regexp_helper.rb', line 97

def rewrite_emphasis(str)
  str.gsub(@org_emphasis_regexp) do |match|
    inner = yield $2, $3
    "#{$1}#{inner}#{$4}"
  end
end
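
The source above re-emits $1 and $4 because gsub replaces the entire match, which includes the prematch and postmatch characters around the emphasized text. The following sketch shows that pattern in isolation. SIMPLE_EMPHASIS is a deliberately simplified stand-in and not the regexp org-ruby builds; TAGS and rewrite_emphasis_sketch are hypothetical names, with TAGS playing the role of map in the example.

```ruby
# Simplified stand-in for @org_emphasis_regexp (an assumption, not the real
# pattern). Groups: prematch, marker, body, postmatch.
SIMPLE_EMPHASIS = %r{(^|\s)([*/=])(\S|\S.*?\S)\2([\s.,]|$)}

# Hypothetical marker-to-tag map.
TAGS = { "*" => "b", "/" => "i", "=" => "code" }.freeze

def rewrite_emphasis_sketch(str)
  str.gsub(SIMPLE_EMPHASIS) do
    pre, marker, body, post = $1, $2, $3, $4
    # gsub replaces the whole match, so put the prematch and postmatch back.
    "#{pre}#{yield(marker, body)}#{post}"
  end
end

html = rewrite_emphasis_sketch("*bold*, /italic/, =code=") do |marker, body|
  "<#{TAGS[marker]}>#{body}</#{TAGS[marker]}>"
end
# html is now "<b>bold</b>, <i>italic</i>, <code>code</code>"
```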

#rewrite_footnote(str) ⇒ Object

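Rewrites footnotes. Supply a block that gets the footnote name and the inline definition (nil when the footnote has none); the return value of the block replaces the footnote in the output.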



# File 'lib/org-ruby/regexp_helper.rb', line 112

def rewrite_footnote(str) # :yields: name, definition or nil
  str.gsub(@org_footnote_regexp) do |match|
    yield $1, $3
  end
end
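
As a rough illustration of what the block receives, the sketch below applies the footnote pattern from the constructor directly with gsub. The sample text, the HTML-like output, and the names FOOTNOTE and definitions are assumptions made for this example.

```ruby
# Pattern copied from RegexpHelper#initialize.
FOOTNOTE = /\[fn:(.+?)(:(.*?))?\]/

definitions = {} # collects inline definitions, keyed by footnote name
text = "See[fn:1] and [fn:note:An inline definition]."

html = text.gsub(FOOTNOTE) do
  name, definition = $1, $3
  # definition is nil for a plain reference such as [fn:1].
  definitions[name] = definition if definition
  "<sup>#{name}</sup>"
end
# html is now "See<sup>1</sup> and <sup>note</sup>."
```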

#rewrite_images(str) ⇒ Object

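Rewrites all inline image links. Supply a block that receives the image link; the return value of the block replaces the link in the output.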



# File 'lib/org-ruby/regexp_helper.rb', line 159

def rewrite_images(str) #  :yields: image_link
  str.gsub(@org_img_regexp) do |match|
    yield $1
  end
end
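
This page does not show how @org_img_regexp is built, so the sketch below uses an assumed pattern (IMAGE_LINK, a hypothetical name) that accepts a bare [[...]] link whose target ends in a common image extension. It is only meant to show the shape of the rewrite, not the library's actual matching rules.

```ruby
# Assumed pattern (hypothetical stand-in for @org_img_regexp).
IMAGE_LINK = /\[\[([^\]\[]+\.(?:jpg|jpeg|gif|png))\]\]/i

text = "Logo: [[images/logo.png]] and [[http://example.com][a site]]"

# Only the image link is rewritten; the ordinary link is left alone.
html = text.gsub(IMAGE_LINK) { %(<img src="#{$1}" />) }
# html is now 'Logo: <img src="images/logo.png" /> and [[http://example.com][a site]]'
```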

#rewrite_links(str) ⇒ Object

Summary

Rewrite org-mode links in a string to markup suitable to the output format.

Usage

Give this a block that expects the link and optional friendly text. Return how that link should get formatted.

Example

re = RegexpHelper.new
result = re.rewrite_links("[[http://www.bing.com]] and [[http://www.hotmail.com][Hotmail]]") do |link, text|
    text ||= link
    "<a href=\"#{link}\">#{text}</a>"
end

In this example, the block body will get called two times. In the first instance, text will be nil (the org-mode markup gives no friendly text for the link http://www.bing.com). In the second instance, the block will get the text Hotmail and the link http://www.hotmail.com. In both cases, the block returns an HTML-style link, and that is what gets recorded in result.



# File 'lib/org-ruby/regexp_helper.rb', line 143

def rewrite_links(str) #  :yields: link, text
  i = str.gsub(@org_link_regexp) do |match|
    yield $1, nil
  end
  if str =~ @org_angle_link_text_regexp
    i.gsub(@org_angle_link_text_regexp) do |match|
      yield "#{$2}:#{$3}", nil
    end
  else
    i.gsub(@org_link_text_regexp) do |match|
      yield $1, $2
    end
  end
end
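
The two-pass structure above (bare links first, then links with friendly text) can be tried out with the following sketch. BARE_LINK and TEXT_LINK are assumed patterns standing in for @org_link_regexp and @org_link_text_regexp, which this page does not show; rewrite_links_sketch is a hypothetical name, and the angle-link branch is left out.

```ruby
# Assumed patterns (hypothetical stand-ins for the real instance variables).
BARE_LINK = /\[\[([^\]\[]+)\]\]/
TEXT_LINK = /\[\[([^\]\[]+)\]\[([^\]\[]+)\]\]/

def rewrite_links_sketch(str)
  # Pass 1: links without friendly text; the block receives nil for the text.
  pass1 = str.gsub(BARE_LINK) { yield $1, nil }
  # Pass 2: links of the form [[link][text]].
  pass1.gsub(TEXT_LINK) { yield $1, $2 }
end

html = rewrite_links_sketch(
  "[[http://www.bing.com]] and [[http://www.hotmail.com][Hotmail]]"
) do |link, text|
  text ||= link
  "<a href=\"#{link}\">#{text}</a>"
end
```

With these assumed patterns, the first link falls back to its URL as the link text and the second uses Hotmail.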

#rewrite_subp(str) ⇒ Object

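Rewrites subscript and superscript (_{foo} and ^{bar}). Supply a block that gets the type ("_" for subscript, "^" for superscript) and the text; the return value of the block replaces the match.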



# File 'lib/org-ruby/regexp_helper.rb', line 105

def rewrite_subp(str) # :yields: type ("_" for subscript and "^" for superscript), text
  str.gsub(@org_subp_regexp) do |match|
    yield $1, $2
  end
end
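
A minimal sketch of the same rewrite, using the subscript/superscript pattern from the constructor directly. The sample string, the HTML-like output, and the names SUBP and TAG_FOR are assumptions for illustration.

```ruby
# Pattern copied from RegexpHelper#initialize.
SUBP = /([_^])\{(.*?)\}/

# Hypothetical mapping from marker type to an HTML-like tag name.
TAG_FOR = { "_" => "sub", "^" => "sup" }.freeze

html = "H_{2}O and E = mc^{2}".gsub(SUBP) do
  type, text = $1, $2
  "<#{TAG_FOR[type]}>#{text}</#{TAG_FOR[type]}>"
end
# html is now "H<sub>2</sub>O and E = mc<sup>2</sup>"
```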