Class: XML

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/magic_xml.rb

Overview

Instance methods (other than those of Enumerable)

Constructor Details

#initialize(*args, &blk) ⇒ XML

initialize can be called in many ways:

  • XML.new

  • XML.new(:tag_symbol)

  • XML.new(:tag_symbol, attributes)

  • XML.new(:tag_symbol, "children", "more", XML.new(…))

  • XML.new(:tag_symbol, attributes, "and", "children")

  • XML.new(:tag_symbol) { monadic code }

  • XML.new(:tag_symbol, attributes) { monadic code }

Or even:

  • XML.new(:tag_symbol, "children") { and some monadic code }

  • XML.new(:tag_symbol, attributes, "children") { and some monadic code }

But typically you won’t be mixing these two styles.

Attribute values will be converted to strings.



# File 'lib/magic_xml.rb', line 765

def initialize(*args, &blk)
  @name     = nil
  @attrs    = {}
  @contents = []
  @name = args.shift if args.size != 0
  if args.size != 0 and args[0].is_a? Hash
    args.shift.each{|k,v|
      # Do automatic conversion here
      # This also assures that the hashes are *not* shared
      self[k] = v
    }
  end
  # Expand Arrays passed as arguments
  self << args
  # FIXME: We'd rather not have people say @name = :foo there :-)
  if blk
    instance_eval(&blk)
  end
end
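
The argument handling above can be sketched without the library. This is a minimal stand-in, not magic_xml itself: the class name `MiniXML` is hypothetical, and it mirrors only the steps visible in the source (take the name, take an optional Hash of attributes, expand the remaining arguments into children, then instance_eval the block).

```ruby
# Hypothetical stand-in class; it mimics only the argument handling shown above.
class MiniXML
  attr_reader :name, :attrs, :contents

  def initialize(*args, &blk)
    @name     = args.empty? ? nil : args.shift
    @attrs    = {}
    @contents = []
    # An optional Hash right after the name holds the attributes;
    # values are converted to strings and copied into a fresh Hash.
    args.shift.each { |k, v| @attrs[k] = v.to_s } if args.first.is_a?(Hash)
    # Remaining arguments become children; arrays are expanded, nils skipped.
    args.flatten.compact.each { |child| @contents << child }
    instance_eval(&blk) if blk
  end
end

img = MiniXML.new(:img, { :width => 200 }, "caption", ["a", ["b"]])
p img.name            # => :img
p img.attrs[:width]   # => "200"
p img.contents        # => ["caption", "a", "b"]
```

Because the attribute Hash is copied entry by entry, later changes to the caller's Hash should not leak into the node.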

Dynamic Method Handling

This class handles dynamic methods through the method_missing method

#method_missing(meth, *args, &blk) ⇒ Object

Define all foo!-methods for the monadic interface: a call such as node.foo!(*args, &blk) appends XML.new(:foo, *args, &blk) as a child of node.



# File 'lib/magic_xml.rb', line 945

def method_missing(meth, *args, &blk)
  if meth.to_s =~ /^(.*)!$/
    self << XML.new($1.to_sym, *args, &blk)
  else
    real_method_missing(meth, *args, &blk)
  end
end
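
As an illustration of the foo!-dispatch, here is a small self-contained sketch. `TinyNode` is a hypothetical stand-in for XML that keeps only a name and a list of children; it is meant to show the method_missing technique, not to reproduce the library.

```ruby
# Hypothetical stand-in; only the foo!-dispatch from the source above is mirrored.
class TinyNode
  attr_reader :name, :contents

  def initialize(name, *children, &blk)
    @name = name
    @contents = children
    instance_eval(&blk) if blk
  end

  # Any call ending in "!" appends a child named after the method.
  def method_missing(meth, *args, &blk)
    if meth.to_s =~ /\A(.*)!\z/
      @contents << TinyNode.new($1.to_sym, *args, &blk)
      self
    else
      super
    end
  end

  def respond_to_missing?(meth, include_private = false)
    meth.to_s.end_with?("!") || super
  end
end

doc = TinyNode.new(:html) do
  head! { title!("Hello") }
  body!
end
p doc.contents.map(&:name)   # => [:head, :body]
```

Since the block is instance_eval'ed, the bare calls `head!` and `body!` are sent to the node under construction, which is what makes the nested, builder-like style possible.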

Instance Attribute Details

#attrs ⇒ Object

Returns the value of attribute attrs.



# File 'lib/magic_xml.rb', line 748

def attrs
  @attrs
end

#contents ⇒ Object

Returns the value of attribute contents.



# File 'lib/magic_xml.rb', line 748

def contents
  @contents
end

#name ⇒ Object

Returns the value of attribute name.



# File 'lib/magic_xml.rb', line 748

def name
  @name
end

Class Method Details

.from_file(file) ⇒ Object

Read file and parse



# File 'lib/magic_xml.rb', line 434

def self.from_file(file)
  file = File.open(file) if file.is_a? String
  parse(file)
end

.from_url(url) ⇒ Object

Fetch URL and parse. Supported:

  • http://…/

  • https://…/

  • file:foo.xml

  • string:<foo/>



# File 'lib/magic_xml.rb', line 445

def self.from_url(url)
  if url =~ /^string:(.*)$/m
    parse($1)
  elsif url =~ /^file:(.*)$/m
    from_file($1)
  elsif url =~ /^http(s?):/
    ssl = ($1 == "s")
    # No, seriously - Ruby needs something better than net/http
    # Something that groks basic auth and queries and redirects automatically:
    # HTTP_LIBRARY.get_content("http://username:passwd/u.r.l/?query")
    # URI parsing must go inside the library, client programs
    # should have nothing to do with it

    # net/http is really inconvenient to use here
    u = URI.parse(url)
    # You're not seeing this:
    if u.query then
      path = u.path + "?" + u.query
    else
      path = u.path
    end
    req = Net::HTTP::Get.new(path)
    if u.userinfo
      username, passwd = u.userinfo.split(/:/,2)
      req.basic_auth username, passwd
    end
    if ssl
      # NOTE: You need libopenssl-ruby installed
      # if you want to use HTTPS. Ubuntu is broken
      # as it doesn't provide it in the default packages.
      require 'net/https'
      http = Net::HTTP.new(u.host, u.port)
      http.use_ssl = true
      http.verify_mode = OpenSSL::SSL::VERIFY_NONE
    else
      http = Net::HTTP.new(u.host, u.port)
    end

    res = http.start {|http_conn| http_conn.request(req) }
    # TODO: Throw a more meaningful exception
    parse(res.body)
  else
    raise "URL protocol #{url} not supported (http, https, file, string are supported)"
  end
end
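
The URL handling in the http/https branch can be tried in isolation with the standard `uri` library. The helper names `request_path` and `credentials` below are hypothetical; they only restate the path/query and userinfo steps from the source above.

```ruby
require 'uri'

# Hypothetical helper (not part of magic_xml): rebuild the request path
# from a parsed URI the way the branch above does.
def request_path(url)
  u = URI.parse(url)
  u.query ? "#{u.path}?#{u.query}" : u.path
end

# Hypothetical helper: split "user:password" userinfo, if present.
def credentials(url)
  info = URI.parse(url).userinfo
  info ? info.split(/:/, 2) : nil
end

puts request_path("http://example.com/feed.xml?page=2")  # prints /feed.xml?page=2
p credentials("https://bob:s3cret@example.com/a.xml")    # => ["bob", "s3cret"]
```

The limit of 2 in `split` keeps any further colons inside the password part.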

.load(obj) ⇒ Object

Like CDuce load_xml. The path can be:

  • file handle

  • URL (a string with :)

  • file name (a string without :)



# File 'lib/magic_xml.rb', line 496

def self.load(obj)
  if obj.is_a? String
    if obj.include? ":"
      from_url(obj)
    else
      from_file(obj)
    end
  else
    parse(obj)
  end
end
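
The dispatch rule is simple enough to check on its own. The helper `load_route` below is hypothetical: it only reports which branch of XML.load an argument would take, without reading or parsing anything.

```ruby
require 'stringio'

# Hypothetical helper: report which branch XML.load would take for an argument.
def load_route(obj)
  if obj.is_a?(String)
    obj.include?(":") ? :from_url : :from_file
  else
    :parse
  end
end

p load_route("feed.xml")               # => :from_file
p load_route("string:<foo/>")          # => :from_url
p load_route(StringIO.new("<foo/>"))   # => :parse
p load_route("C:/data/feed.xml")       # => :from_url
```

One consequence visible in the last line: a path with a drive letter contains a colon and therefore takes the URL branch, so such files are probably better passed to from_file directly.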

.method_missing(meth, *args, &blk) ⇒ Object

XML.foo! == xml!(:foo)

XML.foo == xml(:foo)



# File 'lib/magic_xml.rb', line 425

def self.method_missing(meth, *args, &blk)
  if meth.to_s =~ /^(.*)!$/
    xml!($1.to_sym, *args, &blk)
  else
    XML.new(meth, *args, &blk)
  end
end

.parse(stream, options = {}) ⇒ Object

Parse XML using REXML. Available options:

  • :extra_entities => Proc or Hash (default = nil)

  • :remove_pretty_printing => true/false (default = false)

  • :comments => true/false (default = false)

  • :pi => true/false (default = false)

  • :normalize => true/false (default = false) - normalize the resulting tree (join strings and drop empty ones)

  • :multiple_roots => true/false (default = false) - document can have any number of roots (instead of one). Return all of them in an array instead of root/nil. Also include non-elements (String/PI/Comment) in the returned array!

FIXME: :comments/:pi will break everything if there are comments/PIs outside document root. Now PIs are outside the document root more often than not, so we’re pretty much screwed here.

FIXME: Integrate all kinds of parse, and make them support extra options

FIXME: Benchmark normalize!

FIXME: Benchmark dup-based Enumerable methods

FIXME: Make it possible to include bogus XML_Document superparent, and to make it support out-of-root PIs/Comments


# File 'lib/magic_xml.rb', line 647

def self.parse(stream, options={})
  extra_entities = options[:extra_entities]

  parser = REXML::Parsers::BaseParser.new stream
  stack = [[]]

  while true
    event = parser.pull
    case event[0]
    when :start_element
      attrs = {}
      event[2].each{|k,v| attrs[k.to_sym] = v.xml_unescape(extra_entities) }
      stack << XML.new(event[1].to_sym, attrs, event[3..-1])
      stack[-2] << stack[-1]
    when :end_element
      stack.pop
    # Needs unescaping
    when :text
      e = event[1].xml_unescape(extra_entities)
      # Either inside root or in multi-root mode
      if stack.size > 1 or options[:multiple_roots]
        stack[-1] << e
      elsif event[1] !~ /\S/
        # Ignore out-of-root whitespace in single-root mode
      else
        raise "Non-whitespace text out of document root (and not in multiroot mode): #{event[1]}"
      end
    # CDATA is already unescaped
    when :cdata
      e = event[1]
      if stack.size > 1 or options[:multiple_roots]
        stack[-1] << e
      else
        raise "CDATA out of the document root"
      end
    when :comment
      next unless options[:comments]
      e = XML_Comment.new(event[1])
      if stack.size > 1 or options[:multiple_roots]
        stack[-1] << e
      else
        # FIXME: Ugly !
        raise "Comments out of the document root"
      end
    when :processing_instruction
      # FIXME: Real PI node
      next unless options[:pi]
      e = XML_PI.new(event[1], event[2])
      if stack.size > 1 or options[:multiple_roots]
        stack[-1] << e
      else
        # FIXME: Ugly !
        raise "Processing instruction out of the document root"
      end
    when :end_document
      break
    when :xmldecl,:start_doctype,:end_doctype,:elementdecl
      # Positivery ignore
    when :externalentity,:entity,:attlistdecl,:notationdecl
      # Ignore ???
      #print "Ignored XML event #{event[0]} when parsing\n"
    else
      # Huh ? What's that ?
      #print "Unknown XML event #{event[0]} when parsing\n"
    end
  end
  roots = stack[0]

  roots.each{|root| root.remove_pretty_printing!} if options[:remove_pretty_printing]
  # :remove_pretty_printing does :normalize anyway
  roots.each{|root| root.normalize!} if options[:normalize]
  if options[:multiple_roots]
    roots
  else
    roots[0]
  end
end
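
The stack discipline used by the loop above (`stack[0]` holds the roots, every open element is pushed on top) can be followed on a hand-written event list. The events and the Hash-based node shape below are assumptions made for the sketch; a real run obtains events from the REXML parser and builds XML objects.

```ruby
# Hand-written events in the [type, *payload] shape that the loop above
# consumes; a real run would obtain them from the REXML parser instead.
events = [
  [:start_element, "foo", { "id" => "1" }],
  [:text, "Hello, "],
  [:start_element, "bar", {}],
  [:text, "world"],
  [:end_element, "bar"],
  [:end_element, "foo"],
  [:end_document]
]

# stack[0] collects the roots; every open element sits above it.
stack = [[]]
events.each do |type, *rest|
  case type
  when :start_element
    node = { :name => rest[0].to_sym, :attrs => rest[1], :contents => [] }
    parent = stack[-1]
    (parent.is_a?(Array) ? parent : parent[:contents]) << node
    stack << node
  when :text
    # Text outside any element is ignored in this sketch.
    stack[-1][:contents] << rest[0] if stack.size > 1
  when :end_element
    stack.pop
  when :end_document
    break
  end
end

root = stack[0][0]
p root[:name]                     # => :foo
p root[:contents][1][:contents]   # => ["world"]
```

After a balanced document the stack is back to its single bottom entry, which is why the method can read the roots from `stack[0]`.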

.parse_as_twigs(stream) ⇒ Object

Parse XML in mixed stream/tree mode. The idea is that every time we get a start element, we yield it to the block and let the block decide what to do about it. If it wants the tree below the element, it should call complete! on the yielded node. If a tree was requested, elements below the current one are not processed separately. If it wasn’t, they are.

For example:

<foo><bar/></foo><foo2/>
yield <foo> ... </foo>
.complete! called
process <foo2> next

But:

<foo><bar/></foo><foo2/>
yield <foo> ... </foo>
.complete! not called
process <bar> next

FIXME: yielded values are not reusable for now

FIXME: make more object-oriented



# File 'lib/magic_xml.rb', line 529

def self.parse_as_twigs(stream)
  parser = REXML::Parsers::BaseParser.new stream
  # We don't really need to keep the stack ;-)
  stack = []
  while true
    event = parser.pull
    case event[0]
    when :start_element
      # Now the evil part evil
      attrs = {}
      event[2].each{|k,v| attrs[k.to_sym] = v.xml_unescape}
      node = XML.new(event[1].to_sym, attrs, *event[3..-1])

      # I can't say it's superelegant
      class <<node
        attr_accessor :do_complete
        def complete!
          if @do_complete
            @do_complete.call
            @do_complete = nil
          end
        end
      end
      node.do_complete = proc{
        parse_subtree(node, parser)
      }

      yield(node)
      if node.do_complete
        stack.push node
        node.do_complete = nil # It's too late, complete! shouldn't do anything now
      end
    when :end_element
      stack.pop
    when :end_document
      return
    else
      # FIXME: Do the right thing.
      # For now, ignore *everything* else
      # This is totally incorrect, user might want to
      # see text, comments and stuff like that anyway
    end
  end
end

.parse_sequence(stream, options = {}) ⇒ Object

Parse a sequence. Equivalent to XML.parse(stream, :multiple_roots => true).



# File 'lib/magic_xml.rb', line 726

def self.parse_sequence(stream, options={})
  o = options.dup
  o[:multiple_roots] = true
  parse(stream, o)
end

.parse_subtree(start_node, parser) ⇒ Object

Basically it’s a copy of self.parse, ugly …



# File 'lib/magic_xml.rb', line 575

def self.parse_subtree(start_node, parser)
  stack = [start_node]
  res = nil
  while true
    event = parser.pull
    case event[0]
    when :start_element
      attrs = {}
      event[2].each{|k,v| attrs[k.to_sym] = v.xml_unescape}
      stack << XML.new(event[1].to_sym, attrs, *event[3..-1])
      if stack.size == 1
        res = stack[0]
      else
        stack[-2] << stack[-1]
      end
    when :end_element
      stack.pop
      return if stack == []
    # Needs unescaping
    when :text
      # Ignore whitespace
      if stack.size == 0
        next if event[1] !~ /\S/
        raise "Non-whitespace text out of document root"
      end
      stack[-1] << event[1].xml_unescape
    # CDATA is already unescaped
    when :cdata
      if stack.size == 0
        raise "CDATA out of the document root"
      end
      stack[-1] << event[1]
    when :end_document
      raise "Parse error: end_document inside a subtree, tags are not balanced"
    when :xmldecl,:start_doctype,:end_doctype,:elementdecl,:processing_instruction
      # Positivery ignore
    when :comment,:externalentity,:entity,:attlistdecl,:notationdecl
      # Ignore ???
      #print "Ignored XML event #{event[0]} when parsing\n"
    else
      # Huh ? What's that ?
      #print "Unknown XML event #{event[0]} when parsing\n"
    end
  end
  res

end

.renormalize(stream) ⇒ Object

Renormalize a string containing an XML document



# File 'lib/magic_xml.rb', line 733

def self.renormalize(stream)
  parse(stream).to_s
end

.renormalize_sequence(stream) ⇒ Object

Renormalize a string containing a sequence of XML documents and strings:

XML.renormalize_sequence("<hello />, <world></world>!") => "<hello/>, <world/>!"



# File 'lib/magic_xml.rb', line 741

def self.renormalize_sequence(stream)
  parse_sequence(stream).join
end

Instance Method Details

#<<(cnt) ⇒ Object Also known as: add!

Add children. Possible uses:

Add single element:

self << xml(...)
self << "foo"

Add nothing:

self << nil

Add multiple elements (also works recursively):

self << [a, b, c]
self << [a, [b, c], d]


# File 'lib/magic_xml.rb', line 855

def <<(cnt)
  if cnt.nil?
    # skip
  elsif cnt.is_a? Array
    cnt.each{|elem| self << elem}
  else
    @contents << cnt
  end
  self
end
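
The behaviour described above (skip nil, expand arrays recursively, return self) can be checked with a tiny stand-in. `Bag` is a hypothetical class that holds nothing but a contents array.

```ruby
# Hypothetical stand-in holding only a contents array.
class Bag
  attr_reader :contents

  def initialize
    @contents = []
  end

  def <<(cnt)
    if cnt.nil?
      # nothing to add
    elsif cnt.is_a?(Array)
      cnt.each { |elem| self << elem }
    else
      @contents << cnt
    end
    self   # returning self is what makes chaining possible
  end
end

bag = Bag.new
bag << "a" << nil << ["b", ["c", nil], "d"]
p bag.contents   # => ["a", "b", "c", "d"]
```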

#==(x) ⇒ Object

Equality test, works as if XMLs were normalized, so:

XML.new(:foo, "Hello, ", "world") == XML.new(:foo, "Hello, world")


# File 'lib/magic_xml.rb', line 868

def ==(x)
  return false unless x.is_a? XML
  return false unless name == x.name and attrs == x.attrs
  # Now the hard part, strings can be split in different ways
  # empty string children are possible etc.
  self_i = 0
  othr_i = 0
  while self_i != contents.size or othr_i != x.contents.size
    # Ignore ""s
    if contents[self_i].is_a? String and contents[self_i] == ""
      self_i += 1
      next
    end
    if x.contents[othr_i].is_a? String and x.contents[othr_i] == ""
      othr_i += 1
      next
    end

    # If one is finished and the other contains non-empty elements,
    # they are not equal
    return false if self_i == contents.size or othr_i == x.contents.size

    # Are they both Strings ?
    # Strings can be divided in different ways, and calling normalize!
    # here would be rather expensive, so let's use this complicated
    # algorithm
    if contents[self_i].is_a? String and x.contents[othr_i].is_a? String
      a = contents[self_i]
      b = x.contents[othr_i]
      self_i += 1
      othr_i += 1
      while a != "" or b != ""
        if a == b
          a = ""
          b = ""
        elsif a.size > b.size and a[0, b.size] == b
          a = a[b.size..-1]
          if x.contents[othr_i].is_a? String
            b = x.contents[othr_i]
            othr_i += 1
            next
          end
        elsif b.size > a.size and b[0, a.size] == a
          b = b[a.size..-1]
          if contents[self_i].is_a? String
            a = contents[self_i]
            self_i += 1
            next
          end
        else
          return false
        end
      end
      next
    end

    # OK, so at least one of them is not a String.
    # Hopefully they're either both XMLs or one is an XML and the
    # other is a String. It is also possible that contents contains
    # something illegal, but we aren't catching that,
    # so xml(:foo, Garbage.new) is going to at least equal itself.
    # And we aren't, because xml(:foo, Garbage.new) == xml(:bar, Garbage.new)
    # is going to return an honest false, and incoherent sanity
    # check is worse than no sanity check.
    #
    # Oh yeah, they can be XML_PI or XML_Comment. In such case, this
    # is ok.
    return false unless contents[self_i] == x.contents[othr_i]
    self_i += 1
    othr_i += 1
  end
  return true
end
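
To see what "as if normalized" means for string children, here is a sketch of the simpler normalize-then-compare approach. It is deliberately not the incremental algorithm used above (which avoids building new arrays); the helper name `merge_strings` is hypothetical and it works on flat lists of children only.

```ruby
# Join adjacent strings and drop empty ones in a flat list of children.
def merge_strings(children)
  children.each_with_object([]) do |c, out|
    next if c == ""
    if c.is_a?(String) && out.last.is_a?(String)
      out[-1] = out.last + c
    else
      out << c
    end
  end
end

a = ["Hello, ", "world"]
b = ["Hello", "", ", world"]
p merge_strings(a) == merge_strings(b)                                # => true
p merge_strings(["ab", :x, "c"]) == merge_strings(["a", "bc", :x])    # => false
```

Two child lists that differ only in how their text is split compare equal, while moving text across a non-string child changes the result.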

#=~(pattern) ⇒ Object

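=~ for a few reasonable patterns: a Symbol is compared with the tag name, a Regexp is matched against the node’s text, and any other pattern (Hash, Pattern_any, Pattern_all) is tested with pattern === self.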



# File 'lib/magic_xml.rb', line 1077

def =~(pattern)
  if pattern.is_a? Symbol
    @name == pattern
  elsif pattern.is_a? Regexp
    text =~ pattern
  else # Hash, Pattern_any, Pattern_all
    pattern === self
  end
end

#[](key) ⇒ Object

Read attributes. Also works with pseudoattributes:

img[:@x] == img.child(:x).text # or nil if there isn't any.


# File 'lib/magic_xml.rb', line 812

def [](key)
  if key.to_s[0] == ?@
    tag = key.to_s[1..-1].to_sym
    c = child(tag)
    if c
      c.text
    else
      nil
    end
  else
    @attrs[key]
  end
end

#[]=(key, value) ⇒ Object

Set attributes. Value is automatically converted to String, so you can say:

img[:x] = 200

Also works with pseudoattributes:

foo[:@bar] = "x"


# File 'lib/magic_xml.rb', line 831

def []=(key, value)
  if key.to_s[0] == ?@
    tag = key.to_s[1..-1].to_sym
    c = child(tag)
    if c
      c.contents = [value.to_s]
    else
      self << XML.new(tag, value.to_s)
    end
  else
    @attrs[key] = value.to_s
  end
end
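
The distinction between attributes and pseudoattributes rests entirely on a leading "@" in the key. A small sketch of that test follows; the helper name `key_target` is hypothetical.

```ruby
# Hypothetical helper: split a key into [:child, tag] or [:attr, key].
def key_target(key)
  s = key.to_s
  s.start_with?("@") ? [:child, s[1..-1].to_sym] : [:attr, key]
end

p key_target(:x)    # => [:attr, :x]
p key_target(:@x)   # => [:child, :x]
```

So `img[:x] = 200` stores the string "200" as an attribute, while `img[:@x] = 200` sets the text of the child element x, creating it if needed.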

#add_pretty_printing! ⇒ Object

Add pretty-printing whitespace. Also normalizes the XML.



# File 'lib/magic_xml.rb', line 1113

def add_pretty_printing!
  normalize!
  real_add_pretty_printing!
  normalize!
end

#child(pat = nil, *rest) ⇒ Object

Equivalent to the first element of node.children(pat, *rest). Returns nil if there aren’t any matching children.



# File 'lib/magic_xml.rb', line 1197

def child(pat=nil, *rest)
  children(pat, *rest) {|c|
    return c
  }
  return nil
end

#children(pat = nil, *rest, &blk) ⇒ Object

XML#children(pattern, more_patterns). Return all children of a node matching the pattern. Also:

  • children(:a, :b) == children(:a).children(:b)

  • children(:a, :*, :c) == children(:a).descendants(:c)



# File 'lib/magic_xml.rb', line 1218

def children(pat=nil, *rest, &blk)
  return descendants(*rest, &blk) if pat == :*
  res = []
  @contents.each{|c|
    if pat.nil? or pat === c
      if rest == []
        res << c
        yield c if block_given?
      else
        res += c.children(*rest, &blk)
      end
    end
  }
  res
end

#children_sort_by(*args, &blk) ⇒ Object

Sort children of XML element.



# File 'lib/magic_xml.rb', line 383

def children_sort_by(*args, &blk)
  self.dup{ @contents = @contents.sort_by(*args, &blk) }
end

#deep_map(pat, &blk) ⇒ Object

Change elements based on pattern



# File 'lib/magic_xml.rb', line 1261

def deep_map(pat, &blk)
  if self =~ pat
    yield self
  else
    r = XML.new(self.name, self.attrs)
    each{|c|
      if c.is_a? XML
        r << c.deep_map(pat, &blk)
      else
        r << c
      end
    }
    r
  end
end

#descendant(pat = nil, *rest) ⇒ Object

Equivalent to the first element of node.descendants(pat, *rest). Returns nil if there aren’t any matching descendants.



# File 'lib/magic_xml.rb', line 1206

def descendant(pat=nil, *rest)
  descendants(pat, *rest) {|c|
    return c
  }
  return nil
end

#descendants(pat = nil, *rest, &blk) ⇒ Object

  • XML#descendants

  • XML#descendants(pattern)

  • XML#descendants(pattern, more_patterns)

Return all descendants of a node matching the pattern. If pattern==nil, simply return all descendants. Optionally run a block on each of them if a block was given. If pattern==nil, also match Strings !



# File 'lib/magic_xml.rb', line 1242

def descendants(pat=nil, *rest, &blk)
  res = []
  @contents.each{|c|
    if pat.nil? or pat === c
      if rest == []
        res << c
        yield c if block_given?
      else
        res += c.children(*rest, &blk)
      end
    end
    if c.is_a? XML
      res += c.descendants(pat, *rest, &blk)
    end
  }
  res
end

#dup(&blk) ⇒ Object

This is not a trivial method - first it does a deep copy, second it takes a block which is instance_eval’ed, so you can do things like:

  • node.dup{ @name = :foo }

  • node.dup{ self[:color] = "blue" }



# File 'lib/magic_xml.rb', line 1139

def dup(&blk)
  new_obj = self.raw_dup
  # Attr values stay shared - ugly
  new_obj.attrs = new_obj.attrs.dup
  new_obj.contents = new_obj.contents.map{|c| c.dup}

  new_obj.instance_eval(&blk) if blk
  return new_obj
end

#each(*selector, &blk) ⇒ Object

Iterate over children, possibly with a selector



# File 'lib/magic_xml.rb', line 372

def each(*selector, &blk)
  children(*selector, &blk)
  self
end

#exec!(&blk) ⇒ Object

Make monadic interface more “official”

  • node.exec! { foo!; bar! }

is equivalent to

  • node << xml(:foo) << xml(:bar)



# File 'lib/magic_xml.rb', line 957

def exec!(&blk)
  instance_eval(&blk)
end

#inspect(include_children = 0) ⇒ Object

Convert to a well-formatted XML, but without children information. This is a reasonable format for irb and debugging. If you want to see a few levels of children, call inspect(2) and so on



# File 'lib/magic_xml.rb', line 798

def inspect(include_children=0)
  "<#{@name}" + @attrs.sort.map{|k,v| " #{k}='#{v.xml_attr_escape}'"}.join +
  if @contents.size == 0
    "/>"
  elsif include_children == 0
    ">...</#{name}>"
  else
    ">" + @contents.map{|x| if x.is_a? String then x.xml_escape else x.inspect(include_children-1) end}.join + "</#{name}>"
  end
end

#map(pat = nil) ⇒ Object

Map children, but leave the name/attributes.

FIXME: do we want a shallow or a deep copy here?



# File 'lib/magic_xml.rb', line 1279

def map(pat=nil)
  r = XML.new(self.name, self.attrs)
  each{|c|
    if !pat || (c.is_a?(XML) && c =~ pat)
      r << yield(c)
    else
      r << c
    end
  }
  r
end

#normalize! ⇒ Object

Normalization means joining strings and getting rid of ""s, recursively



# File 'lib/magic_xml.rb', line 1163

def normalize!
  new_contents = []
  @contents.each{|c|
    if c.is_a? String
      next if c == ""
      if new_contents[-1].is_a? String
        new_contents[-1] += c
        next
      end
    else
      c.normalize!
    end
    new_contents.push c
  }
  @contents = new_contents
end

#range(range_start, range_end, end_reached_cb = nil) ⇒ Object

Select a subtree.

NOTE: Uses object_id of the start/end tags! They have to be the same objects, not just equal ones.

<foo>0<a>1</a><b/><c/><d>2</d><e/>3</foo>.range(<a>1</a>, <d>2</d>) returns <foo><b/><c/></foo>

start and end and their descendants are not included in the result tree. Either start or end can be nil.

  • If both start and end are nil, return whole tree.

  • If start is nil, return subtree up to range_end.

  • If start is not inside the tree, return nil.

  • If end is nil, return subtree from start

  • If end is not inside the tree, return subtree from start.

  • If end is before or below start, or they’re the same node, the result is unspecified.

  • If end comes directly after start, or as the first node when start==nil, return the path reaching there.



# File 'lib/magic_xml.rb', line 977

def range(range_start, range_end, end_reached_cb=nil)
  if range_start == nil
    result = XML.new(name, attrs)
  else
    result = nil
  end
  @contents.each {|c|
    # end reached !
    if range_end and c.object_id == range_end.object_id
      end_reached_cb.call if end_reached_cb
      break
    end
    # start reached !
    if range_start and c.object_id == range_start.object_id
      result = XML.new(name, attrs)
      next
    end
    if result # We already started
      if c.is_a? XML
        break_me = false
        result.add! c.range(nil, range_end, lambda{ break_me = true })
        if break_me
          end_reached_cb.call if end_reached_cb
          break
        end
      else # String/XML_PI/XML_Comment
        result.add! c
      end
    else
      # Strings/XML_PI/XML_Comment obviously cannot start a range
      if c.is_a? XML
        break_me = false
        r = c.range(range_start, range_end, lambda{ break_me = true })
        if r
          # start reached !
          result = XML.new(name, attrs, r)
        end
        if break_me
          # end reached !
          end_reached_cb.call if end_reached_cb
          break
        end
      end
    end
  }
  return result
end

#raw_dup ⇒ Object

# File 'lib/magic_xml.rb', line 1133

alias_method :raw_dup, :dup

#real_method_missing ⇒ Object

# File 'lib/magic_xml.rb', line 942

alias_method :real_method_missing, :method_missing

#remove_pretty_printing!(exceptions = nil) ⇒ Object

Get rid of pretty-printing whitespace. Also normalizes the XML.



# File 'lib/magic_xml.rb', line 1088

def remove_pretty_printing!(exceptions=nil)
  normalize!
  real_remove_pretty_printing!(exceptions)
  normalize!
end

#sort(*args, &blk) ⇒ Object

Sort children of XML element.

Using sort is highly wrong, as XML (and XML-extras) is not even Comparable. Use sort_by instead.

Unless you define your own XML#<=> operator, or do something equally weird.



# File 'lib/magic_xml.rb', line 393

def sort(*args, &blk)
  self.dup{ @contents = @contents.sort(*args, &blk) }
end

#sort_by(*args, &blk) ⇒ Object

Sort XML children of XML element.



# File 'lib/magic_xml.rb', line 378

def sort_by(*args, &blk)
  self.dup{ @contents = @contents.select{|c| c.is_a? XML}.sort_by(*args, &blk) }
end

#subsequence(range_start, range_end, start_seen_cb = nil, end_seen_cb = nil) ⇒ Object

XML#subsequence is similar to XML#range, but instead of a trimmed subtree it returns a list of elements. The same elements are included in both cases, but here we do not include any parents!

<foo><a/><b/><c/></foo>.range(a,c) => <foo><b/></foo>
<foo><a/><b/><c/></foo>.subsequence(a,c) => <b/>

<foo><a><a1/></a><b/><c/></foo>.range(a1,c) => <foo><a/><b/></foo> # Does <a/> make sense ?
<foo><a><a1/></a><b/><c/></foo>.subsequence(a1,c) => <b/>

<foo><a><a1/><a2/></a><b/><c/></foo>.range(a1,c) => <foo><a><a2/></a><b/></foo>
<foo><a><a1/><a2/></a><b/><c/></foo>.subsequence(a1,c) => <a2/><b/>

And we return [], not nil, if nothing matches.



# File 'lib/magic_xml.rb', line 1040

def subsequence(range_start, range_end, start_seen_cb=nil, end_seen_cb=nil)
  result = []
  start_seen = range_start.nil?
  @contents.each{|c|
    if range_end and range_end.object_id == c.object_id
      end_seen_cb.call if end_seen_cb
      break
    end
    if range_start and range_start.object_id == c.object_id
      start_seen = true
      start_seen_cb.call if start_seen_cb
      next
    end
    if start_seen
      if c.is_a? XML
        break_me = false
        result += c.subsequence(nil, range_end, nil, lambda{break_me=true})
        break if break_me
      else # String/XML_PI/XML_Comment
        result << c
      end
    else
      # String/XML_PI/XML_Comment cannot start a subsequence
      if c.is_a? XML
        break_me = false
        result += c.subsequence(range_start, range_end, lambda{start_seen=true}, lambda{break_me=true})
        break if break_me
      end
    end
  }
  # Include starting tag if it was right from the range_start
  # Otherwise, return just the raw sequence
  result = [XML.new(@name, @attrs, result)] if range_start == nil
  return result
end

#text ⇒ Object

Return text below the node, stripping all XML tags: "<foo>Hello, <bar>world</bar>!</foo>".xml_parse.text returns "Hello, world!"



# File 'lib/magic_xml.rb', line 1183

def text
  res = ""
  @contents.each{|c|
    if c.is_a? XML
      res << c.text
    elsif c.is_a? String
      res << c
    end # Ignore XML_PI/XML_Comment
  }
  res
end
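
The same recursion can be tried on a stand-in tree. `Node` below is a hypothetical Struct, not the library's XML class; its children are either nodes or strings, and anything else is ignored, as comments and PIs are above.

```ruby
# Hypothetical node type standing in for XML; children are nodes or strings.
Node = Struct.new(:name, :contents) do
  def text
    contents.map { |c|
      case c
      when Node   then c.text
      when String then c
      else ""            # anything else (comments, PIs) contributes nothing
      end
    }.join
  end
end

doc = Node.new(:foo, ["Hello, ", Node.new(:bar, ["world"]), "!"])
puts doc.text   # prints Hello, world!
```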

#text!(*args) ⇒ Object

Add some String children (all arguments get to_s’ed)



# File 'lib/magic_xml.rb', line 1151

def text!(*args)
  args.each{|s| self << s.to_s}
end

#to_s ⇒ Object

Convert to a well-formatted XML



# File 'lib/magic_xml.rb', line 786

def to_s
  "<#{@name}" + @attrs.sort.map{|k,v| " #{k}='#{v.xml_attr_escape}'"}.join +
  if @contents.size == 0
    "/>"
  else
    ">" + @contents.map{|x| if x.is_a? String then x.xml_escape else x.to_s end}.join + "</#{name}>"
  end
end
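
The serialization rules above (attributes in sorted order with single-quoted values, a self-closing tag when there are no children, escaped text) can be sketched on plain arrays. The escaping helpers are assumptions: the library's own xml_escape and xml_attr_escape are not shown in this section, and the `[name, attrs, children]` node shape is hypothetical.

```ruby
# Minimal escaping helpers; the library's own xml_escape / xml_attr_escape
# are not shown in this section, so these are assumptions.
def esc(s)
  s.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
end

def esc_attr(s)
  esc(s).gsub("'", "&apos;")
end

# Serialize [name, attrs, children] triples (a hypothetical node shape).
def serialize(node)
  name, attrs, children = node
  attr_text = attrs.sort_by { |k, _| k.to_s }.map { |k, v| " #{k}='#{esc_attr(v)}'" }.join
  return "<#{name}#{attr_text}/>" if children.empty?
  body = children.map { |c| c.is_a?(String) ? esc(c) : serialize(c) }.join
  "<#{name}#{attr_text}>#{body}</#{name}>"
end

tree = [:p, { :id => "x", :class => "a'b" }, ["1 < 2", [:br, {}, []]]]
puts serialize(tree)   # prints <p class='a&apos;b' id='x'>1 &lt; 2<br/></p>
```

Sorting the attributes makes the output independent of insertion order, which helps when comparing serialized documents as strings.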

#xml!(*args, &blk) ⇒ Object

Add XML child



# File 'lib/magic_xml.rb', line 1155

def xml!(*args, &blk)
  @contents << XML.new(*args, &blk)
end