Class: Bbcode::Tokenizer
- Inherits:
-
Object
- Object
- Bbcode::Tokenizer
- Defined in:
- lib/bbcode/tokenizer.rb
Overview
Scans a string and converts it to a stream of bbcode tokens.
Constant Summary collapse
- BBCODE_TAG_PATTERN =
/\[(\/?)([a-z0-9_-]*)(\s*=?(?:(?:\s*(?:(?:[a-z0-9_-]+)|(?<=\=))\s*[:=]\s*)?(?:"[^"\\]*(?:\\[\s\S][^"\\]*)*"|'[^'\\]*(?:\\[\s\S][^'\\]*)*'|[^\]\s,]+|(?<=,)(?=\s*,))\s*,?\s*)*)\]/i- ATTRIBUTE_PATTERN =
/(?:\s*(?:([a-z0-9_-]+)|^)\s*[:=]\s*)?("[^"\\]*(?:\\[\s\S][^"\\]*)*"|'[^'\\]*(?:\\[\s\S][^'\\]*)*'|[^\]\s,]+|(?<=,)(?=\s*,))\s*,?/i- UNESCAPE_PATTERN =
/\\(.)/
Instance Method Summary collapse
- #parse_attributes_string(attributes_string) ⇒ Object
-
#tokenize(document, handler) ⇒ Object
Parses the document as BBCode-formatted text and calls block with bbcode events.
Instance Method Details
#parse_attributes_string(attributes_string) ⇒ Object
8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
# File 'lib/bbcode/tokenizer.rb', line 8 def parse_attributes_string( attributes_string ) attrs = HashWithIndifferentAccess.new return attrs if attributes_string.nil? next_anonymous_key = -1 attributes_string.scan ATTRIBUTE_PATTERN do |key, value| skip_value = key.blank? && value.blank? key = next_anonymous_key+=1 if key.blank? unless skip_value value = value[1...-1].gsub UNESCAPE_PATTERN, "\\1" if value[0] == value[-1] && ["'", '"'].include?(value[0]) attrs[key] = value end end return attrs end |
#tokenize(document, handler) ⇒ Object
Parses the document as BBCode-formatted text and calls block with bbcode events.
The handler will have the following methods called:
-
.text text A text-event with an additional parameter containing the actual text.
-
.start_element element_name, element_arguments An element-event with 2 additional parameters: The element name as a symbol and the element attributes as a hash. This events indicate the start of the element.
-
.end_element element_name An element-event indicating the end of an element. Optionally, the element_name is added as a parameter. If no parameter is present, it is assumed to be the last started element.
Note that :start_element and :end_element are not guaranteed to be called evenly or in the “correct” order. You must match correct start- and end tags yourself to create the elements.
Also note that :text events are not guaranteed to match the whole text. In some cases, the text might be separated to multiple :text events, even though there are no nodes in between.
47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 |
# File 'lib/bbcode/tokenizer.rb', line 47 def tokenize(document, handler) while !(match = BBCODE_TAG_PATTERN.match(document)).nil? offset = match.begin(0) elem_source = match[0] handler.text document[0...offset] unless offset == 0 elem_is_closing_tag = match[1]=='/' elem_name = (match[2].length > 0 && match[2].to_sym) || nil elem_attr_string = (match[3].length > 0 && match[3]) || nil if (elem_is_closing_tag && !elem_attr_string) || (!elem_is_closing_tag && elem_name) if !elem_is_closing_tag handler.start_element elem_name, parse_attributes_string(elem_attr_string), elem_source else handler.end_element elem_name, elem_source end else handler.text elem_source end document = document[(offset+elem_source.length)..-1] end handler.text document unless document.length == 0 end |