Class: TagTreeScanner

Inherits:
Object
Defined in:
lib/tagtreescanner.rb

Overview

The TagTreeScanner class provides a generic framework for creating a nested hierarchy of tags and text (like XML or HTML) by parsing text. An example use (and the reason it was written) is to convert a wiki markup syntax into HTML.

See the README.txt.html file for examples and more information.

Defined Under Namespace

Classes: Tag, TagFactory, TextNode

Constant Summary

VERSION = "0.8.1"

Constructor Details

#initialize(string_to_parse) ⇒ TagTreeScanner

Scans through string_to_parse and builds a tree of tags based on the regular expressions and rules set by the TagFactory instances present in @tag_genres.

Once the tree has been built, call #to_xml or #to_html to retrieve a string representation.



# File 'lib/tagtreescanner.rb', line 604

def initialize( string_to_parse )
  current = @root = self.class.root_factory.create
  tag_genres = self.class.tag_genres
  text_match = self.class.text_match

  ss = StringScanner.new( string_to_parse )
  while !ss.eos?
    # Keep popping off the current tag until we get to the root,
    # as long as the end criteria is met
    while ( current != @root ) && (!current.close_requires_bol? || ss.bol?) && ss.scan( current.close_match )
      current = current.parent_tag || @root
    end

    # No point in continuing if closing out tags consumed the rest of the string
    break if ss.eos?

    # Look for a tag to open
    if factories = tag_genres[ current.allowed_genre ]
      tag = nil
      factories.each{ |factory|
        if tag = factory.match( ss, self )
          current.append_child( tag )
          current = tag unless tag.autoclose?
          break
        end
      }
      # Start at the top of the loop if we found one
      next if tag
    end

    # Couldn't find a valid tag at this spot
    # so we need to eat some characters
    consumed = ss.scan( text_match )
    current << consumed if current.allows_text?
  end
end
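
The loop above has three steps: close any open tags whose close pattern matches at the current position, try to open a new tag, and otherwise consume plain text. The following is a minimal, self-contained sketch of that control flow and not the library's implementation. ScanTag, RULES, TEXT_MATCH and mini_scan are hypothetical names invented for illustration, and the sketch leaves out genres, beginning-of-line checks and auto-closing tags.

```ruby
require 'strscan'

# Hypothetical stand-in for the library's Tag class.
class ScanTag
  attr_reader :name, :close_match, :parent, :children

  def initialize(name, close_match = nil, parent = nil)
    @name = name
    @close_match = close_match
    @parent = parent
    @children = []
  end

  def to_s
    "<#{name}>#{children.join}</#{name}>"
  end
end

# Hypothetical rules: each pairs an opening pattern with a closing pattern.
RULES = [
  { name: :b, open: /\*\*/, close: /\*\*/ },
  { name: :i, open: /_/,    close: /_/ }
].freeze

# Assumed fallback: one character, then any run of characters that
# cannot start or end a tag. It always consumes at least one character.
TEXT_MATCH = /.[^*_]*/m

def mini_scan(source)
  root = ScanTag.new(:root)
  current = root
  ss = StringScanner.new(source)

  until ss.eos?
    # 1. Close open tags for as long as their close pattern matches here.
    while !current.equal?(root) && ss.scan(current.close_match)
      current = current.parent
    end
    break if ss.eos?

    # 2. Try to open a tag; the first rule that matches wins.
    rule = RULES.find { |r| ss.scan(r[:open]) }
    if rule
      tag = ScanTag.new(rule[:name], rule[:close], current)
      current.children << tag
      current = tag
      next
    end

    # 3. Nothing matched, so consume plain text.
    current.children << ss.scan(TEXT_MATCH)
  end

  root.children.join
end

puts mini_scan("plain **bold _both_** end")
```

Because the fallback pattern in this sketch always matches at least one character, the loop is guaranteed to advance on every pass. A text pattern that can fail to match at some position would leave the scanner stuck in place.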

Class Attribute Details

.root_factory ⇒ Object

Returns the value of attribute root_factory.



# File 'lib/tagtreescanner.rb', line 567

def root_factory
  @root_factory
end

.tag_genres ⇒ Object

Returns the value of attribute tag_genres.



# File 'lib/tagtreescanner.rb', line 567

def tag_genres
  @tag_genres
end

.text_match ⇒ Object

Returns the value of attribute text_match.



# File 'lib/tagtreescanner.rb', line 567

def text_match
  @text_match
end

Class Method Details

.inherited(child_class) ⇒ Object

When a class inherits from TagTreeScanner, its @tag_genres, @root_factory and @text_match are initialized from the parent class's values. The values are assigned by reference, not duplicated.



# File 'lib/tagtreescanner.rb', line 677

def self.inherited( child_class ) #:nodoc:
  child_class.tag_genres = @tag_genres
  child_class.root_factory = @root_factory
  child_class.text_match = @text_match
end
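
The hook above relies on class-level instance variables, which Ruby does not inherit automatically. The sketch below shows the same pattern in isolation, under the assumption that the accessors are plain class-level attr_accessor methods. MiniScanner, WikiScanner and IsolatedScanner are hypothetical names. It also illustrates one consequence of assigning by reference: mutating the inherited hash in place affects the parent, while assigning a new hash does not.

```ruby
class MiniScanner
  # Class-level accessors: the values live on the class object itself.
  class << self
    attr_accessor :tag_genres, :text_match
  end

  @tag_genres = {}
  @text_match = /\w+/

  # Ruby calls this hook whenever a subclass is defined,
  # before the subclass body is evaluated.
  def self.inherited(child_class)
    super
    child_class.tag_genres = @tag_genres
    child_class.text_match = @text_match
  end
end

class WikiScanner < MiniScanner
  # In-place mutation: the parent sees this entry too, because both
  # classes reference the same hash object.
  tag_genres[:inline] = [:bold]
end

class IsolatedScanner < MiniScanner
  # Reassignment: this subclass now has its own, separate hash.
  self.tag_genres = { block: [:paragraph] }
end

p WikiScanner.tag_genres.equal?(MiniScanner.tag_genres)
p MiniScanner.tag_genres.key?(:inline)
p IsolatedScanner.tag_genres.key?(:inline)
```

If separate rule sets per subclass are wanted, assigning a fresh hash (or a duplicate of the inherited one) in the subclass body is one way to avoid the sharing.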

Instance Method Details

#inspect ⇒ Object

Returns a hierarchical representation of the entire tag tree.



# File 'lib/tagtreescanner.rb', line 670

def inspect #:nodoc:
  @root.to_hier
end

#tags ⇒ Object

Returns an array of all root-level tags found.



# File 'lib/tagtreescanner.rb', line 659

def tags
  @root.child_tags
end

#tags_by_name(name) ⇒ Object

Returns an array of all tags in the tree whose Tag#name matches the supplied name.



# File 'lib/tagtreescanner.rb', line 665

def tags_by_name( name )
  @root.tags_by_type( name )
end
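
Unlike #tags, which returns only the root-level tags, #tags_by_name searches the whole tree. The sketch below shows one way such a search can work, as a depth-first walk over child tags. TreeTag is a hypothetical stand-in for the library's Tag class, and the traversal order is an assumption, not something the documentation states.

```ruby
# Hypothetical stand-in for the library's Tag class.
class TreeTag
  attr_reader :name, :children

  def initialize(name, children = [])
    @name = name
    @children = children
  end

  # Only nested tags, skipping plain-text children.
  def child_tags
    children.grep(TreeTag)
  end

  # Depth-first collection of every descendant tag with the given name.
  def tags_by_name(name)
    child_tags.flat_map do |tag|
      (tag.name == name ? [tag] : []) + tag.tags_by_name(name)
    end
  end
end

root = TreeTag.new(:root, [
  TreeTag.new(:p, ["one ", TreeTag.new(:b, ["two"])]),
  TreeTag.new(:b, ["three"]),
  TreeTag.new(:p)
])

p root.child_tags.map(&:name)            # root-level tags only
p root.tags_by_name(:b).map(&:children)  # matches at any depth
```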

#to_html ⇒ Object

Returns an HTML representation of the tag tree.

This is the same as #to_xml, except that empty tags use an explicit close tag, e.g. <div></div> rather than <div />.



# File 'lib/tagtreescanner.rb', line 645

def to_html
  @root.child_tags.inject(''){ |out, tag| out << tag.to_html }
end

#to_xml ⇒ Object

Returns an XML representation of the tag tree.

This is the same as #to_html, except that empty tags do not use an explicit close tag, e.g. <div /> rather than <div></div>.



# File 'lib/tagtreescanner.rb', line 654

def to_xml
  @root.child_tags.inject(''){ |out, tag| out << tag.to_xml }
end
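
The only documented difference between #to_xml and #to_html is how empty tags are written. The sketch below illustrates that difference with a hypothetical MarkupTag class, which is not the library's Tag. It does no escaping of text content and is only meant to show the two serialization styles side by side.

```ruby
# Hypothetical stand-in for the library's Tag class.
class MarkupTag
  def initialize(name, children = [])
    @name = name
    @children = children
  end

  # XML style: an empty tag collapses to a self-closing form.
  def to_xml
    return "<#{@name} />" if @children.empty?
    "<#{@name}>#{serialize(:to_xml)}</#{@name}>"
  end

  # HTML style: an empty tag still gets an explicit close tag.
  def to_html
    "<#{@name}>#{serialize(:to_html)}</#{@name}>"
  end

  private

  # Child tags are serialized recursively; text children are emitted as-is.
  def serialize(format)
    @children.map { |c| c.is_a?(MarkupTag) ? c.public_send(format) : c.to_s }.join
  end
end

tags = [
  MarkupTag.new(:p, ["text", MarkupTag.new(:span)]),
  MarkupTag.new(:div)
]

# Concatenate the root-level tags, as the methods above do.
xml  = tags.inject(String.new) { |out, tag| out << tag.to_xml }
html = tags.inject(String.new) { |out, tag| out << tag.to_html }

puts xml
puts html
```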