Class: Asciidoctor::Parser

Inherits:
Object
Includes:
Logging
Defined in:
lib/asciidoctor/parser.rb

Overview

# => Asciidoctor::Block

Defined Under Namespace

Classes: BlockMatchData

Constant Summary

TAB =

String for matching tab character

?\t
TabIndentRx =

Regexp for leading tab indentation

/^\t+/
StartOfBlockProc =
proc {|l| ((l.start_with? '[') && (BlockAttributeLineRx.match? l)) || (is_delimited_block? l) }
StartOfListProc =
proc {|l| AnyListRx.match? l }
StartOfBlockOrListProc =
proc {|l| (is_delimited_block? l) || ((l.start_with? '[') && (BlockAttributeLineRx.match? l)) || (AnyListRx.match? l) }
NoOp =
nil
AuthorKeys =
['author', 'authorinitials', 'firstname', 'middlename', 'lastname', 'email']
TableCellHorzAlignments =

Internal: A Hash mapping horizontal alignment abbreviations to alignments that can be applied to a table cell (or to all cells in a column)

{
  '<' => 'left',
  '>' => 'right',
  '^' => 'center'
}
TableCellVertAlignments =

Internal: A Hash mapping vertical alignment abbreviations to alignments that can be applied to a table cell (or to all cells in a column)

{
  '<' => 'top',
  '>' => 'bottom',
  '^' => 'middle'
}
TableCellStyles =

Internal: A Hash mapping styles abbreviations to styles that can be applied to a table cell (or to all cells in a column)

{
  'd' => :none,
  's' => :strong,
  'e' => :emphasis,
  'm' => :monospaced,
  'h' => :header,
  'l' => :literal,
  'a' => :asciidoc
}

Class Method Summary

Methods included from Logging

#logger, #message_with_context

Class Method Details

.adjust_indentation!(lines, indent_size = 0, tab_size = 0) ⇒ Object

Remove the block indentation (the amount of whitespace of the least indented line), replace tabs with spaces (using proper tab expansion logic) and, finally, indent the lines by the margin width. Modifies the input Array directly.

This method preserves the significant indentation (the indentation that exceeds the block indent) on each line.

lines - The Array of String lines to process (no trailing newlines).
indent_size - The Integer number of spaces to re-add to the start of non-empty lines after removing the indentation. If this value is < 0, the existing indentation is preserved (optional, default: 0).
tab_size - The Integer number of spaces to use in place of a tab. A value of <= 0 disables the replacement (optional, default: 0).

Examples

source = <<EOS
    def names
      @names.split
    end
EOS

lines = source.split ?\n
# => ["    def names", "      @names.split", "    end"]

# the method modifies lines in place and returns nothing
Parser.adjust_indentation! lines
puts lines.join ?\n
# => def names
# =>   @names.split
# => end

Returns nothing



# File 'lib/asciidoctor/parser.rb', line 2656

def self.adjust_indentation! lines, indent_size = 0, tab_size = 0
  return if lines.empty?

  # expand tabs if a tab character is detected and tab_size > 0
  if tab_size > 0 && lines.any? {|line| line.include? TAB }
    full_tab_space = ' ' * tab_size
    lines.map! do |line|
      if line.empty?
        line
      elsif (tab_idx = line.index TAB)
        if tab_idx == 0
          leading_tabs = 0
          line.each_byte do |b|
            break unless b == 9
            leading_tabs += 1
          end
          line = %(#{full_tab_space * leading_tabs}#{line.slice leading_tabs, line.length})
          next line unless line.include? TAB
        end
        # keeps track of how many spaces were added to adjust offset in match data
        spaces_added = 0
        idx = 0
        result = ''
        line.each_char do |c|
          if c == TAB
            # calculate how many spaces this tab represents, then replace tab with spaces
            if (offset = idx + spaces_added) % tab_size == 0
              spaces_added += (tab_size - 1)
              result = result + full_tab_space
            else
              unless (spaces = tab_size - offset % tab_size) == 1
                spaces_added += (spaces - 1)
              end
              result = result + (' ' * spaces)
            end
          else
            result = result + c
          end
          idx += 1
        end
        result
      else
        line
      end
    end
  end

  # skip block indent adjustment if indent_size is < 0
  return if indent_size < 0

  # determine block indent (assumes no whitespace-only lines are present)
  block_indent = nil
  lines.each do |line|
    next if line.empty?
    if (line_indent = line.length - line.lstrip.length) == 0
      block_indent = nil
      break
    end
    block_indent = line_indent unless block_indent && block_indent < line_indent
  end

  # remove block indent then apply indent_size if specified
  # NOTE block_indent is > 0 if not nil
  if indent_size == 0
    lines.map! {|line| line.empty? ? line : (line.slice block_indent, line.length) } if block_indent
  else
    new_block_indent = ' ' * indent_size
    if block_indent
      lines.map! {|line| line.empty? ? line : new_block_indent + (line.slice block_indent, line.length) }
    else
      lines.map! {|line| line.empty? ? line : new_block_indent + line }
    end
  end

  nil
end
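
The two phases of the method above (expanding tabs to tab stops, then removing the block indent) can be sketched in isolation. This is a simplified, non-mutating illustration under stated assumptions, not the library's implementation; the method names `expand_tabs` and `strip_block_indent` are hypothetical and exist only for this example.

```ruby
# Hypothetical helper: expand each tab to the next tab stop.
def expand_tabs(line, tab_size)
  return line unless tab_size > 0 && line.include?("\t")
  result = +''
  line.each_char do |c|
    if c == "\t"
      # pad up to the next multiple of tab_size (always at least one space)
      result << (' ' * (tab_size - (result.length % tab_size)))
    else
      result << c
    end
  end
  result
end

# Hypothetical helper: remove the indentation shared by all non-empty lines,
# then prepend indent_size spaces. A negative indent_size leaves lines as-is.
def strip_block_indent(lines, indent_size = 0)
  return lines if indent_size < 0
  indents = lines.reject(&:empty?).map { |l| l.length - l.lstrip.length }
  block_indent = indents.min || 0
  margin = ' ' * indent_size
  lines.map { |l| l.empty? ? l : margin + l[block_indent..-1] }
end

lines = ["\tdef names", "\t  @names.split", "\tend"].map { |l| expand_tabs(l, 4) }
puts strip_block_indent(lines).join("\n")
```

Unlike `adjust_indentation!`, these helpers return new values instead of modifying the input Array, which keeps the sketch easy to test.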

.atx_section_title?(line) ⇒ Boolean

Checks whether the line given is an atx section title.

The level returned is 1 less than number of leading markers.

line - [String] candidate title with leading atx marker.

Returns the [Integer] section level if this line is an atx section title, otherwise nothing.

Returns:

  • (Boolean)


# File 'lib/asciidoctor/parser.rb', line 1676

def self.atx_section_title? line
  if Compliance.markdown_syntax ? ((line.start_with? '=', '#') && ExtAtxSectionTitleRx =~ line) :
      ((line.start_with? '=') && AtxSectionTitleRx =~ line)
    $1.length - 1
  end
end
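
The rule that the level is one less than the number of leading markers can be illustrated with a small standalone sketch. The regular expression `SIMPLE_ATX_RX` and the method `atx_level` are assumptions made for this example; the real `AtxSectionTitleRx` and `ExtAtxSectionTitleRx` patterns are stricter and are not reproduced here.

```ruby
# Hypothetical, simplified pattern: 1 to 6 '=' markers, whitespace, then a title.
SIMPLE_ATX_RX = /\A(={1,6})[ \t]+\S/

# Returns the Integer section level, or nil if the line is not an atx title.
def atx_level(line)
  (m = SIMPLE_ATX_RX.match(line)) ? m[1].length - 1 : nil
end

p atx_level('= Document Title')
p atx_level('=== Subsection')
p atx_level('plain paragraph text')
```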

.build_block(block_context, content_model, terminator, parent, reader, attributes, options = {}) ⇒ Object

Whether a block supports compound content should be a config setting. If terminator is false, all the lines in the reader should be parsed. NOTE: could invoke a filter in here, before and after parsing.



# File 'lib/asciidoctor/parser.rb', line 984

def self.build_block(block_context, content_model, terminator, parent, reader, attributes, options = {})
  if content_model == :skip
    skip_processing, parse_as_content_model = true, :simple
  elsif content_model == :raw
    skip_processing, parse_as_content_model = false, :simple
  else
    skip_processing, parse_as_content_model = false, content_model
  end

  if terminator.nil?
    if parse_as_content_model == :verbatim
      lines = reader.read_lines_until break_on_blank_lines: true, break_on_list_continuation: true
    else
      content_model = :simple if content_model == :compound
      # TODO we could also skip processing if we're able to detect reader is a BlockReader
      lines = read_paragraph_lines reader, false, skip_line_comments: true, skip_processing: skip_processing
      # QUESTION check for empty lines after grabbing lines for simple content model?
    end
    block_reader = nil
  elsif parse_as_content_model != :compound
    lines = reader.read_lines_until terminator: terminator, skip_processing: skip_processing, context: block_context, cursor: :at_mark
    block_reader = nil
  # terminator is false when reader has already been prepared
  elsif terminator == false
    lines = nil
    block_reader = reader
  else
    lines = nil
    block_cursor = reader.cursor
    block_reader = Reader.new reader.read_lines_until(terminator: terminator, skip_processing: skip_processing, context: block_context, cursor: :at_mark), block_cursor
  end

  if content_model == :verbatim
    tab_size = (attributes['tabsize'] || parent.document.attributes['tabsize']).to_i
    if (indent = attributes['indent'])
      adjust_indentation! lines, indent.to_i, tab_size
    elsif tab_size > 0
      adjust_indentation! lines, -1, tab_size
    end
  elsif content_model == :skip
    # QUESTION should we still invoke process method if extension is specified?
    return
  end

  if (extension = options[:extension])
    # QUESTION do we want to delete the style?
    attributes.delete('style')
    if (block = extension.process_method[parent, block_reader || (Reader.new lines), attributes.merge])
      attributes.replace block.attributes
      # FIXME if the content model is set to compound, but we only have simple in this context, then
      # forcefully set the content_model to simple to prevent parsing blocks from children
      # TODO document this behavior!!
      if block.content_model == :compound && !(lines = block.lines).empty?
        content_model = :compound
        block_reader = Reader.new lines
      end
    else
      return
    end
  else
    block = Block.new(parent, block_context, content_model: content_model, source: lines, attributes: attributes)
  end

  # QUESTION should we have an explicit map or can we rely on check for *-caption attribute?
  if (attributes.key? 'title') && block.context != :admonition &&
      (parent.document.attributes.key? %(#{block.context}-caption))
    block.title = attributes.delete 'title'
    block.assign_caption(attributes.delete 'caption')
  end

  # reader is confined within boundaries of a delimited block, so look for
  # blocks until there are no more lines
  parse_blocks block_reader, block if content_model == :compound

  block
end
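
The first conditional in the method above maps the requested content model to a pair of values: whether to skip processing, and which model to parse the lines as. A minimal standalone sketch of that mapping follows; the method name `resolve_content_model` is hypothetical and is not part of the Parser API.

```ruby
# Hypothetical helper mirroring the opening conditional of build_block.
# Returns [skip_processing, parse_as_content_model].
def resolve_content_model(content_model)
  case content_model
  when :skip then [true, :simple]   # lines are read but not processed
  when :raw  then [false, :simple]  # raw content is read like simple content
  else            [false, content_model]
  end
end

p resolve_content_model(:skip)
p resolve_content_model(:compound)
```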

.catalog_callouts(text, document) ⇒ Object

Internal: Catalog any callouts found in the text, but don’t process them

text - The String of text in which to look for callouts
document - The current document in which the callouts are stored

Returns a Boolean indicating whether callouts were found



# File 'lib/asciidoctor/parser.rb', line 1110

def self.catalog_callouts(text, document)
  found = false
  autonum = 0
  text.scan CalloutScanRx do
    document.callouts.register $2 == '.' ? (autonum += 1).to_s : $2 unless $&.start_with? '\\'
    # we have to mark as found even if it's escaped so it can be unescaped
    found = true
  end if text.include? '<'
  found
end
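
The scanning logic above, including auto-numbering of `<.>` callouts and the treatment of escaped callouts, can be sketched without a document object. The pattern `SIMPLE_CALLOUT_RX` and the method `collect_callouts` are assumptions for illustration; the real `CalloutScanRx` handles more forms (for example, XML comment style callouts) than this simplified pattern.

```ruby
# Hypothetical, simplified pattern: optional backslash escape, then <N> or <.>
SIMPLE_CALLOUT_RX = /\\?<(\d+|\.)>/

# Returns [found, ids], where ids are the callout numbers that would be registered.
def collect_callouts(text)
  ids = []
  autonum = 0
  found = false
  return [found, ids] unless text.include?('<')
  text.scan(SIMPLE_CALLOUT_RX) do
    m = Regexp.last_match
    # an escaped callout is not registered, but still counts as found
    ids << (m[1] == '.' ? (autonum += 1).to_s : m[1]) unless m[0].start_with?('\\')
    found = true
  end
  [found, ids]
end

p collect_callouts("require 'sinatra' <.>\nget '/hi' <.>")
```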

.catalog_inline_anchor(id, reftext, node, location, doc = node.document) ⇒ Object

Internal: Catalog a matched inline anchor.

id - The String id of the anchor
reftext - The optional String reference text of the anchor
node - The AbstractNode parent node of the anchor node
location - The source location (file and line) where the anchor was found
doc - The document to which the node belongs; computed from node if not specified

Returns nothing



# File 'lib/asciidoctor/parser.rb', line 1130

def self.catalog_inline_anchor id, reftext, node, location, doc = node.document
  reftext = doc.sub_attributes reftext if reftext && (reftext.include? ATTR_REF_HEAD)
  unless doc.register :refs, [id, (Inline.new node, :anchor, reftext, type: :ref, id: id)]
    location = location.cursor if Reader === location
    logger.warn message_with_context %(id assigned to anchor already in use: #{id}), source_location: location
  end
  nil
end
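
The register-or-warn pattern used above can be sketched with a plain Hash. The class `RefCatalog` below is hypothetical and only stands in for the document's reference catalog; it assumes that registration reports failure when the id is already taken, which is how the calling code above treats the result.

```ruby
# Hypothetical stand-in for the document's catalog of references.
class RefCatalog
  attr_reader :refs, :warnings

  def initialize
    @refs = {}
    @warnings = []
  end

  # Returns true if the id was newly registered, false if it was already in use.
  def register(id, node)
    return false if @refs.key?(id)
    @refs[id] = node
    true
  end

  # Mirrors the shape of catalog_inline_anchor: register, warn on a duplicate, return nil.
  def catalog_anchor(id, reftext = nil)
    unless register(id, { id: id, reftext: reftext })
      @warnings << "id assigned to anchor already in use: #{id}"
    end
    nil
  end
end

catalog = RefCatalog.new
catalog.catalog_anchor 'intro', 'Introduction'
catalog.catalog_anchor 'intro'
puts catalog.warnings
```

The first registration wins, so the reference text of the earlier anchor is kept when a duplicate id is reported.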

.catalog_inline_anchors(text, block, document, reader) ⇒ Object

Internal: Catalog any inline anchors found in the text (but don’t convert)

text - The String text in which to look for inline anchors
block - The block in which the references should be searched
document - The current Document on which the references are stored
reader - The source Reader, used to resolve the source location when a duplicate id is reported

Returns nothing



# File 'lib/asciidoctor/parser.rb', line 1146

def self.catalog_inline_anchors text, block, document, reader
  text.scan InlineAnchorScanRx do
    if (id = $1)
      if (reftext = $2)
        next if (reftext.include? ATTR_REF_HEAD) && (reftext = document.sub_attributes reftext).empty?
      end
    else
      id = $3
      if (reftext = $4)
        reftext = reftext.gsub '\]', ']' if reftext.include? ']'
        next if (reftext.include? ATTR_REF_HEAD) && (reftext = document.sub_attributes reftext).empty?
      end
    end
    unless document.register :refs, [id, (Inline.new block, :anchor, reftext, type: :ref, id: id)]
      location = reader.cursor_at_mark
      if (offset = ($`.count LF) + (($&.start_with? LF) ? 1 : 0)) > 0
        (location = location.dup).advance offset
      end
      logger.warn message_with_context %(id assigned to anchor already in use: #{id}), source_location: location
    end
  end if (text.include? '[[') || (text.include? 'or:')
  nil
end

.catalog_inline_biblio_anchor(id, reftext, node, reader) ⇒ Object

Internal: Catalog the bibliography inline anchor found in the start of the list item (but don’t convert)

id - The String id of the anchor
reftext - The optional String reference text of the anchor
node - The AbstractNode parent node of the anchor node
reader - The source Reader for the current Document, positioned at the current list item

Returns nothing



# File 'lib/asciidoctor/parser.rb', line 1178

def self.catalog_inline_biblio_anchor id, reftext, node, reader
  # QUESTION should we sub attributes in reftext (like with regular anchors)?
  unless node.document.register :refs, [id, (Inline.new node, :anchor, reftext && %([#{reftext}]), type: :bibref, id: id)]
    logger.warn message_with_context %(id assigned to bibliography anchor already in use: #{id}), source_location: reader.cursor
  end
  nil
end

.initialize_section(reader, parent, attributes = {}) ⇒ Object

Internal: Initialize a new Section object and assign any attributes provided

The information for this section is retrieved by parsing the lines at the current position of the reader.

reader - the source reader
parent - the parent Section or Document of this Section
attributes - a Hash of attributes to assign to this section (default: {})

Returns the section [Block]



# File 'lib/asciidoctor/parser.rb', line 1567

def self.initialize_section reader, parent, attributes = {}
  document = parent.document
  book = (doctype = document.doctype) == 'book'
  source_location = reader.cursor if document.sourcemap
  sect_style = attributes[1]
  sect_id, sect_reftext, sect_title, sect_level, sect_atx = parse_section_title reader, document, attributes['id']

  if sect_reftext
    attributes['reftext'] = sect_reftext
  else
    sect_reftext = attributes['reftext']
  end

  if sect_style
    if book && sect_style == 'abstract'
      sect_name, sect_level = 'chapter', 1
    elsif (sect_style.start_with? 'sect') && (SectionLevelStyleRx.match? sect_style)
      sect_name = 'section'
    else
      sect_name, sect_special = sect_style, true
      sect_level = 1 if sect_level == 0
      sect_numbered = sect_name == 'appendix'
    end
  elsif book
    sect_name = sect_level == 0 ? 'part' : (sect_level > 1 ? 'section' : 'chapter')
  elsif doctype == 'manpage' && (sect_title.casecmp 'synopsis') == 0
    sect_name, sect_special = 'synopsis', true
  else
    sect_name = 'section'
  end

  section = Section.new parent, sect_level
  section.id, section.title, section.sectname, section.source_location = sect_id, sect_title, sect_name, source_location
  if sect_special
    section.special = true
    if sect_numbered
      section.numbered = true
    elsif document.attributes['sectnums'] == 'all'
      section.numbered = book && sect_level == 1 ? :chapter : true
    end
  elsif document.attributes['sectnums'] && sect_level > 0
    # NOTE a special section here is guaranteed to be nested in another section
    section.numbered = section.special ? parent.numbered && true : true
  elsif book && sect_level == 0 && document.attributes['partnums']
    section.numbered = true
  end

  # generate an ID if one was not embedded or specified as anchor above section title
  if (id = section.id || (section.id = (document.attributes.key? 'sectids') ? (Section.generate_id section.title, document) : nil))
    unless document.register :refs, [id, section]
      logger.warn message_with_context %(id assigned to section already in use: #{id}), source_location: (reader.cursor_at_line reader.lineno - (sect_atx ? 1 : 2))
    end
  end

  section.update_attributes(attributes)
  reader.skip_blank_lines

  section
end
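
When no section style is given, the section name in the method above depends only on the doctype, the level, and (for man pages) the title. That branch can be sketched as a standalone function; the name `default_section_name` is hypothetical, and the styled branches (abstract, appendix, and so on) are deliberately left out.

```ruby
# Hypothetical helper covering only the unstyled branches of initialize_section.
def default_section_name(doctype, level, title = '')
  if doctype == 'book'
    # level 0 is a part, level 1 a chapter, anything deeper a section
    level == 0 ? 'part' : (level > 1 ? 'section' : 'chapter')
  elsif doctype == 'manpage' && title.casecmp('synopsis') == 0
    'synopsis'
  else
    'section'
  end
end

p default_section_name('book', 0)
p default_section_name('book', 1)
p default_section_name('manpage', 1, 'SYNOPSIS')
```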

.is_delimited_block?(line, return_match_data = nil) ⇒ Boolean

Public: Determines whether this line is the start of a known delimited block.

Returns the BlockMatchData (if return_match_data is true) or true (if return_match_data is false) if this line is the start of a delimited block, otherwise nothing.

Returns:

  • (Boolean)


# File 'lib/asciidoctor/parser.rb', line 939

def self.is_delimited_block? line, return_match_data = nil
  # highly optimized for best performance
  return unless (line_len = line.length) > 1 && DELIMITED_BLOCK_HEADS[line.slice 0, 2]
  # open block
  if line_len == 2
    tip = line
    tip_len = 2
  else
    # all other delimited blocks, including fenced code
    if line_len < 5
      tip = line
      tip_len = line_len
    else
      tip = line.slice 0, (tip_len = 4)
    end
    # special case for fenced code blocks
    if Compliance.markdown_syntax && (tip.start_with? '`')
      if tip_len == 4
        if tip == '````'
          return
        elsif (tip = tip.chop) == '```'
          line = tip
          line_len = tip_len = 3
        else
          return
        end
      elsif tip == '```'
        # keep it
      else
        return
      end
    elsif tip_len == 3
      return
    end
  end
  # NOTE line matches the tip when delimiter is minimum length or fenced code
  context, masq = DELIMITED_BLOCKS[tip]
  if context && (line_len == tip_len || (uniform? (line.slice 1, line_len), DELIMITED_BLOCK_TAILS[tip], (line_len - 1)))
    return_match_data ? (BlockMatchData.new context, masq, tip, line) : true
  end
end
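
The core idea of the method above is to look up a short "tip" of the line in a table and then confirm that the rest of the line repeats the same character. A reduced sketch follows. The table `SKETCH_DELIMITERS` covers only a subset of delimiters, and the method `delimited_block_context` is hypothetical; fenced code blocks, table delimiters, and masquerading styles are omitted.

```ruby
# Hypothetical subset of delimiter tips and the block contexts they open.
SKETCH_DELIMITERS = {
  '--'   => :open,
  '----' => :listing,
  '....' => :literal,
  '====' => :example,
  '////' => :comment,
  '****' => :sidebar,
  '____' => :quote,
  '++++' => :pass
}.freeze

# Returns the block context Symbol if the line is a delimiter, otherwise nil.
def delimited_block_context(line)
  # the open block delimiter is exactly two characters
  return SKETCH_DELIMITERS[line] if line.length == 2
  return if line.length < 4
  tip = line[0, 4]
  context = SKETCH_DELIMITERS[tip]
  return unless context
  # a longer delimiter must repeat the same character for the whole line
  line.each_char.all? { |c| c == tip[0] } ? context : nil
end

p delimited_block_context('----')
p delimited_block_context('======')
p delimited_block_context('----x')
```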

.is_next_line_doctitle?(reader, attributes, leveloffset) ⇒ Boolean

Internal: Convenience API for checking if the next line on the Reader is the document title

reader - the source Reader
attributes - a Hash of attributes collected above the current line
leveloffset - an Integer (or integer String value) that represents the current leveloffset

Returns true if the Reader is positioned at the document title, false otherwise

Returns:

  • (Boolean)


# File 'lib/asciidoctor/parser.rb', line 1651

def self.is_next_line_doctitle? reader, attributes, leveloffset
  if leveloffset
    (sect_level = is_next_line_section? reader, attributes) && (sect_level + leveloffset.to_i == 0)
  else
    (is_next_line_section? reader, attributes) == 0
  end
end
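
The leveloffset arithmetic above can be shown on its own: a section title counts as the document title when its level, after the offset is applied, is 0. The method name `doctitle_level?` is hypothetical, and the sketch takes an already computed section level rather than a Reader.

```ruby
# Hypothetical helper: sect_level is the Integer level found on the next line (or nil).
def doctitle_level?(sect_level, leveloffset = nil)
  return false if sect_level.nil?
  leveloffset ? (sect_level + leveloffset.to_i == 0) : sect_level == 0
end

p doctitle_level?(0)
p doctitle_level?(1, '-1')
p doctitle_level?(0, 1)
```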

.is_next_line_section?(reader, attributes) ⇒ Boolean

Internal: Checks if the next line on the Reader is a section title

reader - the source Reader
attributes - a Hash of attributes collected above the current line

Returns the Integer section level if the Reader is positioned at a section title or nil otherwise

Returns:

  • (Boolean)


# File 'lib/asciidoctor/parser.rb', line 1633

def self.is_next_line_section?(reader, attributes)
  if (style = attributes[1]) && (style == 'discrete' || style == 'float')
    return
  elsif Compliance.underline_style_section_titles
    next_lines = reader.peek_lines 2, style && style == 'comment'
    is_section_title?(next_lines[0] || '', next_lines[1])
  else
    atx_section_title?(reader.peek_line || '')
  end
end

.is_section_title?(line1, line2 = nil) ⇒ Boolean

Public: Checks whether the lines given are an atx or setext section title.

line1 - [String] candidate title.
line2 - [String] candidate underline (default: nil).

Returns the [Integer] section level if these lines are a section title, otherwise nothing.

Returns:

  • (Boolean)


# File 'lib/asciidoctor/parser.rb', line 1665

def self.is_section_title?(line1, line2 = nil)
  atx_section_title?(line1) || (line2.nil_or_empty? ? nil : setext_section_title?(line1, line2))
end

.is_sibling_list_item?(line, list_type, sibling_trait) ⇒ Boolean

Internal: Determine whether this line is a sibling list item according to the list type and trait (marker) provided.

line - The String line to check
list_type - The context of the list (:olist, :ulist, :colist, :dlist)
sibling_trait - The String marker for the list or the Regexp to match a sibling

Returns a Boolean indicating whether this line is a sibling list item given the criteria provided

Returns:

  • (Boolean)


# File 'lib/asciidoctor/parser.rb', line 2254

def self.is_sibling_list_item? line, list_type, sibling_trait
  if ::Regexp === sibling_trait
    sibling_trait.match? line
  else
    ListRxMap[list_type] =~ line && sibling_trait == (resolve_list_marker list_type, $1)
  end
end

.next_block(reader, parent, attributes = {}, options = {}) ⇒ Object

Public: Parse and return the next Block at the Reader’s current location

This method begins by skipping over blank lines to find the start of the next block (paragraph, block macro, or delimited block). If a block is found, that block is parsed, initialized as a Block object, and returned. Otherwise, the method returns nothing.

Regular expressions from the Asciidoctor module are used to match block boundaries. The ensuing lines are then processed according to the content model.

reader - The Reader from which to retrieve the next Block.
parent - The Document, Section or Block to which the next Block belongs.
attributes - A Hash of attributes that will become the attributes associated with the parsed Block (default: {}).
options - An options Hash to control parsing (default: {}):
          * :text indicates that the parser is only looking for text content

Returns a Block object built from the parsed content of the processed lines, or nothing if no block is found.



# File 'lib/asciidoctor/parser.rb', line 476

def self.next_block(reader, parent, attributes = {}, options = {})
  # skip ahead to the block content; bail if we've reached the end of the reader
  return unless (skipped = reader.skip_blank_lines)

  # check for option to find list item text only
  # if skipped a line, assume a list continuation was
  # used and block content is acceptable
  if (text_only = options[:text]) && skipped > 0
    options.delete :text
    text_only = nil
  end

  document = parent.document

  if options.fetch :parse_metadata, true
    # read lines until there are no more metadata lines to read
    while  reader, document, attributes, options
      # discard the line just processed
      reader.shift
      # QUESTION should we clear the attributes? no known cases when it's necessary
      reader.skip_blank_lines || return
    end
  end

  if (extensions = document.extensions)
    block_extensions, block_macro_extensions = extensions.blocks?, extensions.block_macros?
  end

  # QUESTION should we introduce a parsing context object?
  reader.mark
  this_line, doc_attrs, style = reader.read_line, document.attributes, attributes[1]
  block = block_context = cloaked_context = terminator = nil

  if (delimited_block = is_delimited_block? this_line, true)
    block_context = cloaked_context = delimited_block.context
    terminator = delimited_block.terminator
    if style
      unless style == block_context.to_s
        if delimited_block.masq.include? style
          block_context = style.to_sym
        elsif delimited_block.masq.include?('admonition') && ADMONITION_STYLES.include?(style)
          block_context = :admonition
        elsif block_extensions && extensions.registered_for_block?(style, block_context)
          block_context = style.to_sym
        else
          logger.debug message_with_context %(unknown style for #{block_context} block: #{style}), source_location: reader.cursor_at_mark if logger.debug?
          style = block_context.to_s
        end
      end
    else
      style = attributes['style'] = block_context.to_s
    end
  end

  # this loop is used for flow control; it only executes once, and only when delimited_block is not set
  # break once a block is found or at end of loop
  # returns nil if the line should be dropped
  while true
    # process lines verbatim
    if style && Compliance.strict_verbatim_paragraphs && (VERBATIM_STYLES.include? style)
      block_context = style.to_sym
      reader.unshift_line this_line
      # advance to block parsing =>
      break
    end

    # process lines normally
    if text_only
      indented = this_line.start_with? ' ', TAB
    else
      # NOTE move this declaration up if we need it when text_only is false
      md_syntax = Compliance.markdown_syntax
      if this_line.start_with? ' '
        indented, ch0 = true, ' '
        # QUESTION should we test line length?
        if md_syntax && this_line.lstrip.start_with?(*MARKDOWN_THEMATIC_BREAK_CHARS.keys) &&
            #!(this_line.start_with? '    ') &&
            (MarkdownThematicBreakRx.match? this_line)
          # NOTE we're letting break lines (horizontal rule, page_break, etc) have attributes
          block = Block.new(parent, :thematic_break, content_model: :empty)
          break
        end
      elsif this_line.start_with? TAB
        indented, ch0 = true, TAB
      else
        indented, ch0 = false, this_line.chr
        layout_break_chars = md_syntax ? HYBRID_LAYOUT_BREAK_CHARS : LAYOUT_BREAK_CHARS
        if (layout_break_chars.key? ch0) &&
            (md_syntax ? (ExtLayoutBreakRx.match? this_line) : (uniform? this_line, ch0, (ll = this_line.length)) && ll > 2)
          # NOTE we're letting break lines (horizontal rule, page_break, etc) have attributes
          block = Block.new(parent, layout_break_chars[ch0], content_model: :empty)
          break
        # NOTE very rare that a text-only line will end in ] (e.g., inline macro), so check that first
        elsif (this_line.end_with? ']') && (this_line.include? '::')
          #if (this_line.start_with? 'image', 'video', 'audio') && BlockMediaMacroRx =~ this_line
          if (ch0 == 'i' || (this_line.start_with? 'video:', 'audio:')) && BlockMediaMacroRx =~ this_line
            blk_ctx, target, blk_attrs = $1.to_sym, $2, $3
            block = Block.new parent, blk_ctx, content_model: :empty
            if blk_attrs
              case blk_ctx
              when :video
                posattrs = ['poster', 'width', 'height']
              when :audio
                posattrs = []
              else # :image
                posattrs = ['alt', 'width', 'height']
              end
              block.parse_attributes blk_attrs, posattrs, sub_input: true, into: attributes
            end
            # style doesn't have special meaning for media macros
            attributes.delete 'style' if attributes.key? 'style'
            if target.include? ATTR_REF_HEAD
              if (expanded_target = block.sub_attributes target).empty? &&
                  (doc_attrs['attribute-missing'] || Compliance.attribute_missing) == 'drop-line' &&
                  (block.sub_attributes target + ' ', attribute_missing: 'drop-line', drop_line_severity: :ignore).empty?
                attributes.clear
                return
              else
                target = expanded_target
              end
            end
            if blk_ctx == :image
              document.register :images, [target, (attributes['imagesdir'] = doc_attrs['imagesdir'])]
              # NOTE style is the value of the first positional attribute in the block attribute line
              attributes['alt'] ||= style || (attributes['default-alt'] = Helpers.basename(target, true).tr('_-', ' '))
              unless (scaledwidth = attributes.delete 'scaledwidth').nil_or_empty?
                # NOTE assume % units if not specified
                attributes['scaledwidth'] = (TrailingDigitsRx.match? scaledwidth) ? %(#{scaledwidth}%) : scaledwidth
              end
              if attributes.key? 'title'
                block.title = attributes.delete 'title'
                block.assign_caption((attributes.delete 'caption'), 'figure')
              end
            end
            attributes['target'] = target
            break

          elsif ch0 == 't' && (this_line.start_with? 'toc:') && BlockTocMacroRx =~ this_line
            block = Block.new parent, :toc, content_model: :empty
            block.parse_attributes $1, [], into: attributes if $1
            break

          elsif block_macro_extensions ? (CustomBlockMacroRx =~ this_line &&
              (extension = extensions.registered_for_block_macro? $1) || (report_unknown_block_macro = logger.debug?)) :
              (logger.debug? && (report_unknown_block_macro = CustomBlockMacroRx =~ this_line))
            if report_unknown_block_macro
              logger.debug message_with_context %(unknown name for block macro: #{$1}), source_location: reader.cursor_at_mark
            else
              content = $3
              if (target = $2).include? ATTR_REF_HEAD
                if (expanded_target = parent.sub_attributes target).empty? &&
                    (doc_attrs['attribute-missing'] || Compliance.attribute_missing) == 'drop-line' &&
                    (parent.sub_attributes target + ' ', attribute_missing: 'drop-line', drop_line_severity: :ignore).empty?
                  attributes.clear
                  return
                else
                  target = expanded_target
                end
              end
              if extension.config[:content_model] == :attributes
                document.parse_attributes content, extension.config[:positional_attrs] || [], sub_input: true, into: attributes if content
              else
                attributes['text'] = content || ''
              end
              if (default_attrs = extension.config[:default_attrs])
                attributes.update(default_attrs) {|_, old_v| old_v }
              end
              if (block = extension.process_method[parent, target, attributes])
                attributes.replace block.attributes
                break
              else
                attributes.clear
                return
              end
            end
          end
        end
      end
    end

    # haven't found anything yet, continue
    if !indented && (ch0 ||= this_line.chr) == '<' && CalloutListRx =~ this_line
      reader.unshift_line this_line
      block = parse_callout_list(reader, $~, parent, document.callouts)
      attributes['style'] = 'arabic'
      break

    elsif UnorderedListRx.match? this_line
      reader.unshift_line this_line
      attributes['style'] = style = 'bibliography' if !style && Section === parent && parent.sectname == 'bibliography'
      block = parse_list(reader, :ulist, parent, style)
      break

    elsif OrderedListRx.match? this_line
      reader.unshift_line this_line
      block = parse_list(reader, :olist, parent, style)
      attributes['style'] = block.style if block.style
      break

    elsif ((this_line.include? '::') || (this_line.include? ';;')) && DescriptionListRx =~ this_line
      reader.unshift_line this_line
      block = parse_description_list(reader, $~, parent)
      break

    elsif (style == 'float' || style == 'discrete') && (Compliance.underline_style_section_titles ?
        (is_section_title? this_line, reader.peek_line) : !indented && (atx_section_title? this_line))
      reader.unshift_line this_line
      float_id, float_reftext, float_title, float_level = parse_section_title reader, document, attributes['id']
      attributes['reftext'] = float_reftext if float_reftext
      block = Block.new(parent, :floating_title, content_model: :empty)
      block.title = float_title
      attributes.delete 'title'
      block.id = float_id || ((doc_attrs.key? 'sectids') ? (Section.generate_id block.title, document) : nil)
      block.level = float_level
      break

    # FIXME create another set for "passthrough" styles
    # FIXME make this more DRY!
    elsif style && style != 'normal'
      if PARAGRAPH_STYLES.include?(style)
        block_context = style.to_sym
        cloaked_context = :paragraph
        reader.unshift_line this_line
        # advance to block parsing =>
        break
      elsif ADMONITION_STYLES.include?(style)
        block_context = :admonition
        cloaked_context = :paragraph
        reader.unshift_line this_line
        # advance to block parsing =>
        break
      elsif block_extensions && extensions.registered_for_block?(style, :paragraph)
        block_context = style.to_sym
        cloaked_context = :paragraph
        reader.unshift_line this_line
        # advance to block parsing =>
        break
      else
        logger.debug message_with_context %(unknown style for paragraph: #{style}), source_location: reader.cursor_at_mark if logger.debug?
        style = nil
        # continue to process paragraph
      end
    end

    reader.unshift_line this_line

    # a literal paragraph: contiguous lines starting with at least one whitespace character
    # NOTE style can only be nil or "normal" at this point
    if indented && !style
      lines = read_paragraph_lines reader, (list_item = options[:list_item]) && skipped == 0, skip_line_comments: text_only
      adjust_indentation! lines
      block = Block.new(parent, :literal, content_model: :verbatim, source: lines, attributes: attributes)
      if list_item
        # a literal gets special meaning inside of a description list
        block.set_option 'listparagraph'
        block.default_subs = []
      end
    # a normal paragraph: contiguous non-blank/non-continuation lines (left-indented or normal style)
    else
      lines = read_paragraph_lines reader, skipped == 0 && options[:list_item], skip_line_comments: true
      # NOTE don't check indented here since it's extremely rare
      #if text_only || indented
      if text_only
        # if [normal] is used over an indented paragraph, shift content to left margin
        # QUESTION do we even need to shift since whitespace is normalized by XML in this case?
        adjust_indentation! lines if indented && style == 'normal'
        block = Block.new(parent, :paragraph, content_model: :simple, source: lines, attributes: attributes)
      elsif (ADMONITION_STYLE_HEADS.include? ch0) && (this_line.include? ':') && (AdmonitionParagraphRx =~ this_line)
        lines[0] = $' # string after match
        attributes['name'] = admonition_name = (attributes['style'] = $1).downcase
        attributes['textlabel'] = (attributes.delete 'caption') || doc_attrs[%(#{admonition_name}-caption)]
        block = Block.new(parent, :admonition, content_model: :simple, source: lines, attributes: attributes)
      elsif md_syntax && ch0 == '>' && this_line.start_with?('> ')
        lines.map! {|line| line == '>' ? (line.slice 1, line.length) : ((line.start_with? '> ') ? (line.slice 2, line.length) : line) }
        if lines[-1].start_with? '-- '
          credit_line = (credit_line = lines.pop).slice 3, credit_line.length
          unless lines.empty?
            lines.pop while lines[-1].empty?
          end
        end
        attributes['style'] = 'quote'
        # NOTE will only detect discrete (aka free-floating) headings
        # TODO could assume a discrete heading when inside a block context
        # FIXME Reader needs to be created w/ line info
        block = build_block(:quote, :compound, false, parent, Reader.new(lines), attributes)
        if credit_line
          attribution, citetitle = (block.apply_subs credit_line).split ', ', 2
          attributes['attribution'] = attribution if attribution
          attributes['citetitle'] = citetitle if citetitle
        end
      elsif ch0 == '"' && lines.size > 1 && (lines[-1].start_with? '-- ') && (lines[-2].end_with? '"')
        lines[0] = this_line.slice 1, this_line.length # strip leading quote
        credit_line = (credit_line = lines.pop).slice 3, credit_line.length
        lines.pop while lines[-1].empty?
        lines << lines.pop.chop # strip trailing quote
        attributes['style'] = 'quote'
        block = Block.new(parent, :quote, content_model: :simple, source: lines, attributes: attributes)
        attribution, citetitle = (block.apply_subs credit_line).split ', ', 2
        attributes['attribution'] = attribution if attribution
        attributes['citetitle'] = citetitle if citetitle
      else
        # if [normal] is used over an indented paragraph, shift content to left margin
        # QUESTION do we even need to shift since whitespace is normalized by XML in this case?
        adjust_indentation! lines if indented && style == 'normal'
        block = Block.new(parent, :paragraph, content_model: :simple, source: lines, attributes: attributes)
      end

      catalog_inline_anchors((lines.join LF), block, document, reader)
    end

    break # forbid loop from executing more than once
  end unless delimited_block

  # either delimited block or styled paragraph
  unless block
    case block_context
    when :listing, :source
      if block_context == :source || (!attributes[1] && (language = attributes[2] || doc_attrs['source-language']))
        if language
          attributes['style'] = 'source'
          attributes['language'] = language
          AttributeList.rekey attributes, [nil, nil, 'linenums']
        else
          AttributeList.rekey attributes, [nil, 'language', 'linenums']
          if doc_attrs.key? 'source-language'
            attributes['language'] = doc_attrs['source-language']
          end unless attributes.key? 'language'
        end
        if attributes['linenums-option'] || doc_attrs['source-linenums-option']
          attributes['linenums'] = ''
        end unless attributes.key? 'linenums'
        if doc_attrs.key? 'source-indent'
          attributes['indent'] = doc_attrs['source-indent']
        end unless attributes.key? 'indent'
      end
      block = build_block(:listing, :verbatim, terminator, parent, reader, attributes)
    when :fenced_code
      attributes['style'] = 'source'
      if (ll = this_line.length) > 3
        if (comma_idx = (language = this_line.slice 3, ll).index ',')
          if comma_idx > 0
            language = (language.slice 0, comma_idx).strip
            attributes['linenums'] = '' if comma_idx < ll - 4
          else
            attributes['linenums'] = '' if ll > 4
          end
        else
          language = language.lstrip
        end
      end
      if language.nil_or_empty?
        attributes['language'] = doc_attrs['source-language'] if doc_attrs.key? 'source-language'
      else
        attributes['language'] = language
      end
      if attributes['linenums-option'] || doc_attrs['source-linenums-option']
        attributes['linenums'] = ''
      end unless attributes.key? 'linenums'
      if doc_attrs.key? 'source-indent'
        attributes['indent'] = doc_attrs['source-indent']
      end unless attributes.key? 'indent'
      terminator = terminator.slice 0, 3
      block = build_block(:listing, :verbatim, terminator, parent, reader, attributes)
    when :table
      block_cursor = reader.cursor
      block_reader = Reader.new reader.read_lines_until(terminator: terminator, skip_line_comments: true, context: :table, cursor: :at_mark), block_cursor
      # NOTE it's very rare that format is set when using a format hint char, so short-circuit
      unless terminator.start_with? '|', '!'
        # NOTE infer dsv once all other format hint chars are ruled out
        attributes['format'] ||= (terminator.start_with? ',') ? 'csv' : 'dsv'
      end
      block = parse_table(block_reader, parent, attributes)
    when :sidebar
      block = build_block(block_context, :compound, terminator, parent, reader, attributes)
    when :admonition
      attributes['name'] = admonition_name = style.downcase
      attributes['textlabel'] = (attributes.delete 'caption') || doc_attrs[%(#{admonition_name}-caption)]
      block = build_block(block_context, :compound, terminator, parent, reader, attributes)
    when :open, :abstract, :partintro
      block = build_block(:open, :compound, terminator, parent, reader, attributes)
    when :literal
      block = build_block(block_context, :verbatim, terminator, parent, reader, attributes)
    when :example
      block = build_block(block_context, :compound, terminator, parent, reader, attributes)
    when :quote, :verse
      AttributeList.rekey(attributes, [nil, 'attribution', 'citetitle'])
      block = build_block(block_context, (block_context == :verse ? :verbatim : :compound), terminator, parent, reader, attributes)
    when :stem, :latexmath, :asciimath
      attributes['style'] = STEM_TYPE_ALIASES[attributes[2] || doc_attrs['stem']] if block_context == :stem
      block = build_block(:stem, :raw, terminator, parent, reader, attributes)
    when :pass
      block = build_block(block_context, :raw, terminator, parent, reader, attributes)
    when :comment
      build_block(block_context, :skip, terminator, parent, reader, attributes)
      attributes.clear
      return
    else
      if block_extensions && (extension = extensions.registered_for_block? block_context, cloaked_context)
        unless (content_model = extension.config[:content_model]) == :skip
          unless (positional_attrs = extension.config[:positional_attrs] || []).empty?
            AttributeList.rekey(attributes, [nil] + positional_attrs)
          end
          if (default_attrs = extension.config[:default_attrs])
            default_attrs.each {|k, v| attributes[k] ||= v }
          end
          # QUESTION should we clone the extension for each cloaked context and set in config?
          attributes['cloaked-context'] = cloaked_context
        end
        unless (block = build_block block_context, content_model, terminator, parent, reader, attributes, extension: extension)
          attributes.clear
          return
        end
      else
        # this should only happen if there's a misconfiguration
        raise %(Unsupported block type #{block_context} at #{reader.cursor})
      end
    end
  end

  # FIXME we've got to clean this up, it's horrible!
  block.source_location = reader.cursor_at_mark if document.sourcemap
  # FIXME title should be assigned when block is constructed
  block.title = attributes.delete 'title' if attributes.key? 'title'
  # TODO eventually remove the style attribute from the attributes hash
  #block.style = attributes.delete 'style'
  block.style = attributes['style']
  if (block_id = block.id || (block.id = attributes['id']))
    unless document.register :refs, [block_id, block]
      logger.warn message_with_context %(id assigned to block already in use: #{block_id}), source_location: reader.cursor_at_mark
    end
  end
  # FIXME remove the need for this update!
  block.update_attributes attributes unless attributes.empty?
  block.commit_subs

  #if doc_attrs.key? :pending_attribute_entries
  #  doc_attrs.delete(:pending_attribute_entries).each do |entry|
  #    entry.save_to block.attributes
  #  end
  #end

  if block.sub? :callouts
    # No need to sub callouts if none are found when cataloging
    block.remove_sub :callouts unless catalog_callouts block.source, document
  end

  block
end
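
The `:fenced_code` branch above derives the language and the linenums flag from the text that follows the opening fence. The following is a minimal standalone sketch of that step only. `parse_fence_line` is a hypothetical helper name, not part of Asciidoctor, and the fallback to the `source-language` document attribute is left out.

```ruby
# Hypothetical helper (not part of Asciidoctor): mirrors the :fenced_code branch.
# this_line is the opening fence line, i.e. three backticks plus optional text.
def parse_fence_line(this_line)
  language = nil
  linenums = false
  if (ll = this_line.length) > 3
    language = this_line.slice 3, ll
    if (comma_idx = language.index ',')
      if comma_idx > 0
        # any text after the comma turns on line numbering
        linenums = true if comma_idx < ll - 4
        language = (language.slice 0, comma_idx).strip
      else
        # comma directly after the fence; language is left untouched, as in the source
        linenums = true if ll > 4
      end
    else
      language = language.lstrip
    end
  end
  { language: language, linenums: linenums }
end

fence = '`' * 3
p parse_fence_line(fence + 'ruby,linenums')
p parse_fence_line(fence + ' ruby')
```

A trailing comma with nothing after it (for example `ruby,`) leaves line numbering off, because the comma index is then equal to `ll - 4`.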

.next_section(reader, parent, attributes = {}) ⇒ Object

Public: Return the next section from the Reader.

This method processes block metadata, content and subsections for this section and returns the Section object and any orphaned attributes.

If the parent is a Document and has a header (document title), then this method will put any non-section blocks at the start of document into a preamble Block. If there are no such blocks, the preamble is dropped.

Since we are reading line-by-line, there’s a chance that metadata that should be associated with the following block gets consumed. To deal with this case, the method returns a running Hash of “orphaned” attributes that get passed to the next Section or Block.

reader - the source Reader
parent - the parent Section or Document of this new section
attributes - a Hash of metadata that was left orphaned from the previous Section.

Examples

source
# => "= Greetings\n\nThis is my doc.\n\n== Salutations\n\nIt is awesome."

reader = Reader.new source, nil, normalize: true
# create empty document to parent the section
# and hold attributes extracted from header
doc = Document.new

Parser.next_section(reader, doc)[0].title
# => "Greetings"

Parser.next_section(reader, doc)[0].title
# => "Salutations"

returns a two-element Array containing the Section and Hash of orphaned attributes



# File 'lib/asciidoctor/parser.rb', line 297

def self.next_section reader, parent, attributes = {}
  preamble = intro = part = false

  # check if we are at the start of processing the document
  # NOTE we could drop a hint in the attributes to indicate
  # that we are at a section title (so we don't have to check)
  if parent.context == :document && parent.blocks.empty? && ((has_header = parent.header?) ||
      (attributes.delete 'invalid-header') || !(is_next_line_section? reader, attributes))
    book = (document = parent).doctype == 'book'
    if has_header || (book && attributes[1] != 'abstract')
      preamble = intro = Block.new parent, :preamble, content_model: :compound
      preamble.title = parent.attr 'preface-title' if book && (parent.attr? 'preface-title')
      parent.blocks << preamble
    end
    section = parent
    current_level = 0
    if parent.attributes.key? 'fragment'
      expected_next_level = -1
    # small tweak to allow subsequent level-0 sections for book doctype
    elsif book
      expected_next_level, expected_next_level_alt = 1, 0
    else
      expected_next_level = 1
    end
  else
    book = (document = parent.document).doctype == 'book'
    section = initialize_section reader, parent, attributes
    # clear attributes except for title attribute, which must be carried over to next content block
    attributes = (title = attributes['title']) ? { 'title' => title } : {}
    expected_next_level = (current_level = section.level) + 1
    if current_level == 0
      part = book
    elsif current_level == 1 && section.special
      # NOTE technically preface and abstract sections are only permitted in the book doctype
      unless (sectname = section.sectname) == 'appendix' || sectname == 'preface' || sectname == 'abstract'
        expected_next_level = nil
      end
    end
  end

  reader.skip_blank_lines

  # Parse lines belonging to this section and its subsections until we
  # reach the end of this section level
  #
  # 1. first look for metadata thingies (anchor, attribute list, block title line, etc)
  # 2. then look for a section, recurse if found
  # 3. then process blocks
  #
  # We have to parse all the metadata lines before continuing with the loop,
  # otherwise subsequent metadata lines get interpreted as block content
  while reader.has_more_lines?
    parse_block_metadata_lines reader, document, attributes
    if (next_level = is_next_line_section?(reader, attributes))
      if document.attr? 'leveloffset'
        next_level += (document.attr 'leveloffset').to_i
        next_level = 0 if next_level < 0
      end
      if next_level > current_level
        if expected_next_level
          unless next_level == expected_next_level || (expected_next_level_alt && next_level == expected_next_level_alt) || expected_next_level < 0
            expected_condition = expected_next_level_alt ? %(expected levels #{expected_next_level_alt} or #{expected_next_level}) : %(expected level #{expected_next_level})
            logger.warn message_with_context %(section title out of sequence: #{expected_condition}, got level #{next_level}), source_location: reader.cursor
          end
        else
          logger.error message_with_context %(#{sectname} sections do not support nested sections), source_location: reader.cursor
        end
        new_section, attributes = next_section reader, section, attributes
        section.assign_numeral new_section
        section.blocks << new_section
      elsif next_level == 0 && section == document
        logger.error message_with_context 'level 0 sections can only be used when doctype is book', source_location: reader.cursor unless book
        new_section, attributes = next_section reader, section, attributes
        section.assign_numeral new_section
        section.blocks << new_section
      else
        # close this section (and break out of the nesting) to begin a new one
        break
      end
    else
      # just take one block or else we run the risk of overrunning section boundaries
      block_cursor = reader.cursor
      if (new_block = next_block reader, intro || section, attributes, parse_metadata: false)
        # REVIEW this may be doing too much
        if part
          if !section.blocks?
            # if this block wasn't marked as [partintro], emulate behavior as if it had
            if new_block.style != 'partintro'
              # emulate [partintro] paragraph
              if new_block.context == :paragraph
                new_block.context = :open
                new_block.style = 'partintro'
              # emulate [partintro] open block
              else
                new_block.parent = (intro = Block.new section, :open, content_model: :compound)
                intro.style = 'partintro'
                section.blocks << intro
              end
            end
          elsif section.blocks.size == 1
            first_block = section.blocks[0]
            # open the [partintro] open block for appending
            if !intro && first_block.content_model == :compound
              logger.error message_with_context 'illegal block content outside of partintro block', source_location: block_cursor
            # rebuild [partintro] paragraph as an open block
            elsif first_block.content_model != :compound
              new_block.parent = (intro = Block.new section, :open, content_model: :compound)
              intro.style = 'partintro'
              section.blocks.shift
              if first_block.style == 'partintro'
                first_block.context = :paragraph
                first_block.style = nil
              end
              intro << first_block
              section.blocks << intro
            end
          end
        end

        (intro || section).blocks << new_block
        attributes.clear
      #else
      #  # don't clear attributes if we don't find a block because they may
      #  # be trailing attributes that didn't get associated with a block
      end
    end

    reader.skip_blank_lines || break
  end

  if part
    unless section.blocks? && section.blocks[-1].context == :section
      logger.error message_with_context 'invalid part, must have at least one section (e.g., chapter, appendix, etc.)', source_location: reader.cursor
    end
  # NOTE we could try to avoid creating a preamble in the first place, though
  # that would require reworking assumptions in next_section since the preamble
  # is treated like an untitled section
  elsif preamble # implies parent == document
    if preamble.blocks?
      # unwrap standalone preamble (i.e., document has no sections) except for books, if permissible
      unless book || document.blocks[1] || !Compliance.unwrap_standalone_preamble
        document.blocks.shift
        while (child_block = preamble.blocks.shift)
          document << child_block
        end
      end
    # drop the preamble if it has no content
    else
      document.blocks.shift
    end
  end

  # The attributes returned here are orphaned attributes that fall at the end
  # of a section that need to get transferred to the next section
  # see "trailing block attributes transfer to the following section" in
  # test/attributes_test.rb for an example
  [section != parent ? section : nil, attributes.merge]
end
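
The level check performed inside the loop can be read in isolation. The sketch below is an assumption-labelled restatement of that one condition: `sequence_warning` is a hypothetical name, it returns the warning text instead of logging it, and an expected level of -1 stands for the fragment case in which any level is accepted.

```ruby
# Hypothetical helper (not part of Asciidoctor): returns the warning message the
# loop would log for an out-of-sequence section title, or nil if the level is fine.
def sequence_warning(next_level, expected_next_level, expected_next_level_alt = nil)
  return if next_level == expected_next_level
  return if expected_next_level_alt && next_level == expected_next_level_alt
  return if expected_next_level < 0 # fragment: no level is enforced
  if expected_next_level_alt
    expected_condition = %(expected levels #{expected_next_level_alt} or #{expected_next_level})
  else
    expected_condition = %(expected level #{expected_next_level})
  end
  %(section title out of sequence: #{expected_condition}, got level #{next_level})
end

puts sequence_warning(3, 2)    # a level-3 title directly under a level-1 section
puts sequence_warning(2, 1, 0) # book doctype at the document root
```

For the book doctype the alternate level 0 is accepted at the document root, which is why the second call reports two expected levels.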

.parse(reader, document, options = {}) ⇒ Object

Public: Parses AsciiDoc source read from the Reader into the Document

This method is the main entry-point into the Parser when parsing a full document. It first looks for and, if found, processes the document title. It then proceeds to iterate through the lines in the Reader, parsing the document into nested Sections and Blocks.

reader - the Reader holding the source lines of the document
document - the empty Document into which the lines will be parsed
options - a Hash of options to control processing

returns the Document object



# File 'lib/asciidoctor/parser.rb', line 91

def self.parse(reader, document, options = {})
  block_attributes = parse_document_header(reader, document)

  # NOTE don't use a postfix conditional here as it's known to confuse JRuby in certain circumstances
  unless options[:header_only]
    while reader.has_more_lines?
      new_section, block_attributes = next_section(reader, document, block_attributes)
      if new_section
        document.assign_numeral new_section
        document.blocks << new_section
      end
    end
  end

  document
end

.parse_block_metadata_line(reader, document, attributes, options = {}) ⇒ Object

Internal: Parse the next line if it contains metadata for the following block

This method handles lines with the following content:

  • line or block comment

  • anchor

  • attribute list

  • block title

Any attributes found will be inserted into the attributes argument. If the line contains block metadata, the method returns true; otherwise it returns a falsy value (nil).

reader - the source reader
document - the current Document
attributes - a Hash of attributes in which any metadata found will be stored
options - a Hash of options to control processing (default: {}):

*  :text indicates the parser is only looking for text content,
     thus neither a block title nor an attribute entry should be captured

returns true if the line contains metadata, otherwise falsy



# File 'lib/asciidoctor/parser.rb', line 2021

def self.parse_block_metadata_line reader, document, attributes, options = {}
  if (next_line = reader.peek_line) &&
      (options[:text] ? (next_line.start_with? '[', '/') : (normal = next_line.start_with? '[', '.', '/', ':'))
    if next_line.start_with? '['
      if next_line.start_with? '[['
        if (next_line.end_with? ']]') && BlockAnchorRx =~ next_line
          # NOTE registration of id and reftext is deferred until block is processed
          attributes['id'] = $1
          if (reftext = $2)
            attributes['reftext'] = (reftext.include? ATTR_REF_HEAD) ? (document.sub_attributes reftext) : reftext
          end
          return true
        end
      elsif (next_line.end_with? ']') && BlockAttributeListRx =~ next_line
        current_style = attributes[1]
        # extract id, role, and options from first positional attribute and remove, if present
        if (document.parse_attributes $1, [], sub_input: true, sub_result: true, into: attributes)[1]
          attributes[1] = (parse_style_attribute attributes, reader) || current_style
        end
        return true
      end
    elsif normal && (next_line.start_with? '.')
      if BlockTitleRx =~ next_line
        # NOTE title doesn't apply to section, but we need to stash it for the first block
        # TODO should issue an error if this is found above the document title
        attributes['title'] = $1
        return true
      end
    elsif !normal || (next_line.start_with? '/')
      if next_line == '//'
        return true
      elsif normal && (uniform? next_line, '/', (ll = next_line.length))
        unless ll == 3
          reader.read_lines_until terminator: next_line, skip_first_line: true, preserve_last_line: true, skip_processing: true, context: :comment
          return true
        end
      else
        return true unless next_line.start_with? '///'
      end if next_line.start_with? '//'
    # NOTE the final condition can be consolidated into single line
    elsif normal && (next_line.start_with? ':') && AttributeEntryRx =~ next_line
      process_attribute_entry reader, document, attributes, $~
      return true
    end
  end
end
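
As a rough illustration of the dispatch above, the sketch below classifies a single line by its leading characters. It is not the library's implementation: `classify_metadata_line` is a hypothetical name, the four regular expressions are simplified stand-ins for BlockAnchorRx, BlockAttributeListRx, BlockTitleRx and AttributeEntryRx, and nothing is read from a Reader or stored in an attributes Hash.

```ruby
# Hypothetical helper (not part of Asciidoctor). Returns a Symbol naming the kind
# of metadata the line holds, or nil if the line is not block metadata.
def classify_metadata_line(line, text_only: false)
  # Simplified stand-ins for the real patterns (assumptions for illustration).
  anchor_rx = /\A\[\[[A-Za-z_:][\w\-:.]*(?:, *.+)?\]\]\z/
  attrlist_rx = /\A\[(?:|[\w.#%{,"'].*)\]\z/
  title_rx = /\A\.\.?[^ \t.]/
  attr_entry_rx = /\A:!?\w[^:]*:(?:[ \t]+.*)?\z/

  if line.start_with? '[['
    anchor_rx.match?(line) ? :anchor : nil
  elsif line.start_with? '['
    attrlist_rx.match?(line) ? :attribute_list : nil
  elsif line.start_with? '//'
    if line == '//' || !line.start_with?('///')
      :line_comment
    elsif !text_only && line.length > 3 && line.delete('/').empty?
      :comment_block # four or more slashes open a comment block
    end
  elsif text_only
    nil # titles and attribute entries are not captured in text-only mode
  elsif line.start_with? '.'
    title_rx.match?(line) ? :title : nil
  elsif line.start_with? ':'
    attr_entry_rx.match?(line) ? :attribute_entry : nil
  end
end

['[[intro]]', '[source,ruby]', '.A title', '// note', '////', ':icons: font', 'plain text'].each do |line|
  puts %(#{line.inspect} => #{classify_metadata_line(line).inspect})
end
```

As in the method above, a line of exactly three slashes is not treated as metadata, and a line that starts with `[[` but is not a valid anchor is not retried as an attribute list.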

.parse_block_metadata_lines(reader, document, attributes = {}, options = {}) ⇒ Object

Internal: Parse lines of metadata until a line of metadata is not found.

This method processes sequential lines containing block metadata, ignoring blank lines and comments.

reader - the source reader
document - the current Document
attributes - a Hash of attributes in which any metadata found will be stored (default: {})
options - a Hash of options to control processing (default: {}):

*  :text indicates that the parser is only looking for text content
     and thus the block title should not be captured

returns the Hash of attributes including any metadata found



# File 'lib/asciidoctor/parser.rb', line 1992

def self.parse_block_metadata_lines reader, document, attributes = {}, options = {}
  while parse_block_metadata_line reader, document, attributes, options
    # discard the line just processed
    reader.shift
    reader.skip_blank_lines || break
  end
  attributes
end

.parse_blocks(reader, parent, attributes = nil) ⇒ Object

Public: Parse blocks from this reader until there are no more lines.

This method calls Parser#next_block until there are no more lines in the Reader. It does not consider sections because it’s assumed the Reader only has lines which are within a delimited block region.

reader - The Reader containing the lines to process
parent - The parent Block to which to attach the parsed blocks

Returns nothing.



# File 'lib/asciidoctor/parser.rb', line 1071

def self.parse_blocks(reader, parent, attributes = nil)
  if attributes
    while ((block = next_block reader, parent, attributes.merge) && parent.blocks << block) || reader.has_more_lines?; end
  else
    while ((block = next_block reader, parent) && parent.blocks << block) || reader.has_more_lines?; end
  end
  nil
end

.parse_callout_list(reader, match, parent, callouts) ⇒ Object

Internal: Parse and construct a callout list Block from the current position of the Reader and advance the document callouts catalog to the next list.

reader - The Reader from which to retrieve the callout list.
match - The Regexp match containing the head of the list.
parent - The parent Block to which this callout list belongs.
callouts - The document callouts catalog.

Returns the Block that represents the parsed callout list.



# File 'lib/asciidoctor/parser.rb', line 1221

def self.parse_callout_list reader, match, parent, callouts
  list_block = List.new(parent, :colist)
  next_index = 1
  autonum = 0
  # NOTE skip the match on the first time through as we've already done it (emulates begin...while)
  while match || ((match = CalloutListRx.match reader.peek_line) && reader.mark)
    if (num = match[1]) == '.'
      num = (autonum += 1).to_s
    end
    # might want to move this check to a validate method
    unless num == next_index.to_s
      logger.warn message_with_context %(callout list item index: expected #{next_index}, got #{num}), source_location: reader.cursor_at_mark
    end
    if (list_item = parse_list_item reader, list_block, match, '<1>')
      list_block.items << list_item
      if (coids = callouts.callout_ids list_block.items.size).empty?
        logger.warn message_with_context %(no callout found for <#{list_block.items.size}>), source_location: reader.cursor_at_mark
      else
        list_item.attributes['coids'] = coids
      end
    end
    next_index += 1
    match = nil
  end

  callouts.next_list
  list_block
end
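
The numbering rule used by the loop above (a `.` marker takes the next automatic number, and every resolved number is compared against a running index) can be sketched without a Reader. `check_callout_sequence` is a hypothetical helper that takes the raw markers as Strings and returns the warning messages instead of logging them.

```ruby
# Hypothetical helper (not part of Asciidoctor): validates a sequence of callout
# list markers such as ['1', '2', '.'] and returns the warnings it would produce.
def check_callout_sequence(markers)
  warnings = []
  next_index = 1
  autonum = 0
  markers.each do |num|
    # a "." marker is replaced by the next automatic number
    num = (autonum += 1).to_s if num == '.'
    unless num == next_index.to_s
      warnings << %(callout list item index: expected #{next_index}, got #{num})
    end
    next_index += 1
  end
  warnings
end

p check_callout_sequence(%w(1 2 3))
p check_callout_sequence(%w(1 3))
p check_callout_sequence(%w(1 .))
```

The automatic counter is independent of the running index, so mixing explicit and automatic markers in one list, as in the last call, produces a warning.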

.parse_cellspec(line, pos = :end, delimiter = nil) ⇒ Object

Internal: Parse the cell specs for the current cell.

The cell specs dictate the cell’s alignments, styles or filters, colspan, rowspan and/or repeating content.

The default spec when pos == :end is {} since we already know we’re at a delimiter. When pos == :start, we may or may not be at a delimiter; a nil spec indicates that we are not.

returns the Hash of attributes that indicate how to layout and style this cell in the table.



# File 'lib/asciidoctor/parser.rb', line 2473

def self.parse_cellspec(line, pos = :end, delimiter = nil)
  m, rest = nil, ''

  if pos == :start
    if line.include? delimiter
      spec_part, delimiter, rest = line.partition delimiter
      if (m = CellSpecStartRx.match spec_part)
        return [{}, rest] if m[0].empty?
      else
        return [nil, line]
      end
    else
      return [nil, line]
    end
  else # pos == :end
    if (m = CellSpecEndRx.match line)
      # NOTE return the line stripped of trailing whitespace if no cellspec is found in this case
      return [{}, line.rstrip] if m[0].lstrip.empty?
      rest = m.pre_match
    else
      return [{}, line]
    end
  end

  spec = {}
  if m[1]
    colspec, rowspec = m[1].split '.'
    colspec = colspec.nil_or_empty? ? 1 : colspec.to_i
    rowspec = rowspec.nil_or_empty? ? 1 : rowspec.to_i
    if m[2] == '+'
      spec['colspan'] = colspec unless colspec == 1
      spec['rowspan'] = rowspec unless rowspec == 1
    elsif m[2] == '*'
      spec['repeatcol'] = colspec unless colspec == 1
    end
  end

  if m[3]
    colspec, rowspec = m[3].split '.'
    if !colspec.nil_or_empty? && TableCellHorzAlignments.key?(colspec)
      spec['halign'] = TableCellHorzAlignments[colspec]
    end
    if !rowspec.nil_or_empty? && TableCellVertAlignments.key?(rowspec)
      spec['valign'] = TableCellVertAlignments[rowspec]
    end
  end

  if m[4] && TableCellStyles.key?(m[4])
    spec['style'] = TableCellStyles[m[4]]
  end

  [spec, rest]
end
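
To illustrate how the four capture groups map to attributes, here is a minimal sketch that interprets an already isolated spec string. `SKETCH_CELL_SPEC_RX` and `sketch_cellspec` are hypothetical names, and the pattern is a simplified stand-in for CellSpecStartRx and CellSpecEndRx, whose definitions are not shown on this page. The lookup tables are copied from the constants documented above.

```ruby
# Lookup tables copied from the constant summary of this class.
SKETCH_HORZ = { '<' => 'left', '>' => 'right', '^' => 'center' }
SKETCH_VERT = { '<' => 'top', '>' => 'bottom', '^' => 'middle' }
SKETCH_STYLES = { 'd' => :none, 's' => :strong, 'e' => :emphasis, 'm' => :monospaced,
                  'h' => :header, 'l' => :literal, 'a' => :asciidoc }

# ASSUMPTION: simplified pattern, not the real CellSpecStartRx/CellSpecEndRx.
# Groups: 1 = span or repeat factor, 2 = operator, 3 = alignment, 4 = style.
SKETCH_CELL_SPEC_RX = /\A(?:(\d+(?:\.\d+)?|\.\d+)([*+]))?([<^>](?:\.[<^>])?|\.[<^>])?([a-z])?\z/

# Hypothetical helper: returns the spec Hash, or nil if the text is not a spec.
def sketch_cellspec(text)
  return nil unless (m = SKETCH_CELL_SPEC_RX.match(text))
  spec = {}
  if m[1]
    colspec, rowspec = m[1].split('.')
    colspec = (colspec.nil? || colspec.empty?) ? 1 : colspec.to_i
    rowspec = (rowspec.nil? || rowspec.empty?) ? 1 : rowspec.to_i
    if m[2] == '+'
      spec['colspan'] = colspec unless colspec == 1
      spec['rowspan'] = rowspec unless rowspec == 1
    elsif m[2] == '*'
      spec['repeatcol'] = colspec unless colspec == 1
    end
  end
  if m[3]
    halign, valign = m[3].split('.')
    spec['halign'] = SKETCH_HORZ[halign] if halign && SKETCH_HORZ.key?(halign)
    spec['valign'] = SKETCH_VERT[valign] if valign && SKETCH_VERT.key?(valign)
  end
  spec['style'] = SKETCH_STYLES[m[4]] if m[4] && SKETCH_STYLES.key?(m[4])
  spec
end

puts sketch_cellspec('2.3+^.>s').inspect
```

A span of 1 is the default, so it is never written to the Hash; the same holds for a repeat factor of 1.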

.parse_colspecs(records) ⇒ Object

Internal: Parse the column specs for this table.

The column specs dictate the number of columns, relative width of columns, default alignments for cells in each column, and/or default styles or filters applied to the cells in the column.

Every column spec is guaranteed to have a width.

Returns an Array of Hashes, one per column, each holding the attributes that specify how to format and lay out the cells in that column.



# File 'lib/asciidoctor/parser.rb', line 2414

def self.parse_colspecs records
  records = records.delete ' ' if records.include? ' '
  # check for deprecated syntax: single number, equal column spread
  if records == records.to_i.to_s
    return ::Array.new(records.to_i) { { 'width' => 1 } }
  end

  specs = []
  # NOTE -1 argument ensures we don't drop empty records
  ((records.include? ',') ? (records.split ',', -1) : (records.split ';', -1)).each do |record|
    if record.empty?
      specs << { 'width' => 1 }
    # TODO might want to use scan rather than this mega-regexp
    elsif (m = ColumnSpecRx.match(record))
      spec = {}
      if m[2]
        # make this an operation
        colspec, rowspec = m[2].split '.'
        if !colspec.nil_or_empty? && TableCellHorzAlignments.key?(colspec)
          spec['halign'] = TableCellHorzAlignments[colspec]
        end
        if !rowspec.nil_or_empty? && TableCellVertAlignments.key?(rowspec)
          spec['valign'] = TableCellVertAlignments[rowspec]
        end
      end

      if (width = m[3])
        # to_i will strip the optional %
        spec['width'] = width == '~' ? -1 : width.to_i
      else
        spec['width'] = 1
      end

      # make this an operation
      if m[4] && TableCellStyles.key?(m[4])
        spec['style'] = TableCellStyles[m[4]]
      end

      if m[1]
        1.upto(m[1].to_i) { specs << spec.merge }
      else
        specs << spec
      end
    end
  end
  specs
end
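
The record splitting, default width, and multiplier handling can be shown without the full column pattern. This is a minimal sketch under stated assumptions: `sketch_colspecs` and `SKETCH_COL_RX` are hypothetical, and the pattern only understands an optional `N*` multiplier followed by an optional width, whereas ColumnSpecRx also captures alignment and style.

```ruby
# ASSUMPTION: reduced pattern (multiplier and width only), not ColumnSpecRx.
SKETCH_COL_RX = /\A(?:(\d+)\*)?(\d+%?|~)?\z/

# Hypothetical helper mirroring the control flow of parse_colspecs.
def sketch_colspecs(records)
  records = records.delete(' ')
  # deprecated syntax: a single number means that many equal-width columns
  return Array.new(records.to_i) { { 'width' => 1 } } if records == records.to_i.to_s

  specs = []
  separator = records.include?(',') ? ',' : ';'
  # the -1 limit keeps trailing empty records, so "1," yields two columns
  records.split(separator, -1).each do |record|
    if record.empty?
      specs << { 'width' => 1 }
    elsif (m = SKETCH_COL_RX.match(record))
      width = m[2]
      # String#to_i ignores a trailing %; ~ marks an autowidth column as -1
      spec = { 'width' => (width ? (width == '~' ? -1 : width.to_i) : 1) }
      if m[1]
        m[1].to_i.times { specs << spec.dup }
      else
        specs << spec
      end
    end
  end
  specs
end

puts sketch_colspecs('1,,2*3').inspect
```

Each repeated column receives its own copy of the Hash, so a later change to one column does not leak into its siblings.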

.parse_description_list(reader, match, parent) ⇒ Object

Internal: Parse and construct a description list Block from the current position of the Reader

reader - The Reader from which to retrieve the description list
match - The Regexp match for the head of the list
parent - The parent Block to which this description list belongs

Returns the Block encapsulating the parsed description list



# File 'lib/asciidoctor/parser.rb', line 1193

def self.parse_description_list reader, match, parent
  list_block = List.new parent, :dlist
  # detects a description list item that uses the same delimiter (::, :::, :::: or ;;)
  sibling_pattern = DescriptionListSiblingRx[match[2]]
  list_block.items << (current_pair = parse_list_item reader, list_block, match, sibling_pattern)

  while reader.has_more_lines? && sibling_pattern =~ reader.peek_line
    next_pair = parse_list_item reader, list_block, $~, sibling_pattern
    if current_pair[1]
      list_block.items << (current_pair = next_pair)
    else
      current_pair[0] << next_pair[0][0]
      current_pair[1] = next_pair[1]
    end
  end

  list_block
end
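
The loop above lets consecutive terms share one description: while the current pair has no description, the next term is folded into it. A minimal sketch of that grouping over plain data follows; `group_terms` is a hypothetical helper, and each entry is a `[term, description_or_nil]` pair standing in for the value returned by parse_list_item.

```ruby
# Hypothetical helper: groups [term, description] entries the way the
# while loop in parse_description_list groups term/description pairs.
def group_terms(entries)
  items = []
  current = nil
  entries.each do |term, description|
    if current.nil? || current[1]
      # previous pair is complete (or this is the first entry): start a new pair
      items << (current = [[term], description])
    else
      # previous pair has no description yet: add the term and adopt this description
      current[0] << term
      current[1] = description
    end
  end
  items
end

puts group_terms([['CPU', nil], ['Processor', 'The brain of the computer.']]).inspect
```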

.parse_document_header(reader, document) ⇒ Object

Public: Parses the document header of the AsciiDoc source read from the Reader

Reads the AsciiDoc source from the Reader until the end of the document header is reached. The Document object is populated with information from the header (document title, document attributes, etc). The document attributes are then saved to establish a save point to which to rollback after parsing is complete.

This method assumes that there are no blank lines at the start of the document, which are automatically removed by the reader.

Returns the Hash of orphan block attributes captured above the header.



# File 'lib/asciidoctor/parser.rb', line 120

def self.parse_document_header(reader, document)
  # capture lines of block-level metadata and plow away comment lines that precede first block
  block_attrs =  reader, document
  doc_attrs = document.attributes

  # special case, block title is not allowed above document title,
  # carry attributes over to the document body
  if (implicit_doctitle = is_next_line_doctitle? reader, block_attrs, doc_attrs['leveloffset']) && block_attrs['title']
    return document.finalize_header block_attrs, false
  end

  # yep, document title logic in AsciiDoc is just insanity
  # definitely an area for spec refinement

  unless (val = doc_attrs['doctitle']).nil_or_empty?
    document.title = doctitle_attr_val = val
  end

  # if the first line is the document title, add a header to the document and parse the header metadata
  if implicit_doctitle
    source_location = reader.cursor if document.sourcemap
    document.id, _, l0_section_title, _, atx = parse_section_title reader, document
    if doctitle_attr_val
      # NOTE doctitle attribute (set above or below implicit doctitle) overrides implicit doctitle
      l0_section_title = nil
    else
      document.title = l0_section_title
      doc_attrs['doctitle'] = doctitle_attr_val = document.apply_header_subs l0_section_title
    end
    document.header.source_location = source_location if source_location
    # default to compat-mode if document has setext doctitle
    doc_attrs['compat-mode'] = '' unless atx || (document.attribute_locked? 'compat-mode')
    if (separator = block_attrs['separator'])
      doc_attrs['title-separator'] = separator unless document.attribute_locked? 'title-separator'
    end
    if (doc_id = block_attrs['id'])
      document.id = doc_id
    else
      doc_id = document.id
    end
    if (role = block_attrs['role'])
      doc_attrs['role'] = role
    end
    if (reftext = block_attrs['reftext'])
      doc_attrs['reftext'] = reftext
    end
    block_attrs.clear
    (modified_attrs = document.instance_variable_get :@attributes_modified).delete 'doctitle'
     reader, document
    if modified_attrs.include? 'doctitle'
      if (val = doc_attrs['doctitle']).nil_or_empty? || val == doctitle_attr_val
        doc_attrs['doctitle'] = doctitle_attr_val
      else
        document.title = val
      end
    elsif !l0_section_title
      modified_attrs << 'doctitle'
    end
    document.register :refs, [doc_id, document] if doc_id
  end

  # parse title and consume name section of manpage document
  parse_manpage_header reader, document, block_attrs if document.doctype == 'manpage'

  # NOTE block_attrs are the block-level attributes (not document attributes) that
  # precede the first line of content (document title, first section or first block)
  document.finalize_header block_attrs
end

.parse_header_metadata(reader, document = nil) ⇒ Object

Public: Consume and parse the two header lines (line 1 = author info, line 2 = revision info).

Returns the Hash of header metadata. If a Document object is supplied, the metadata is applied directly to the attributes of the Document.

reader - the Reader holding the source lines of the document
document - the Document we are building (default: nil)

Examples

data = ["Author Name <[email protected]>\n", "v1.0, 2012-12-21: Coincide w/ end of world.\n"]
parse_header_metadata(Reader.new data, nil, normalize: true)
# => { 'author' => 'Author Name', 'firstname' => 'Author', 'lastname' => 'Name', 'email' => '[email protected]',
#       'revnumber' => '1.0', 'revdate' => '2012-12-21', 'revremark' => 'Coincide w/ end of world.' }


# File 'lib/asciidoctor/parser.rb', line 1782

def self.parse_header_metadata(reader, document = nil)
  doc_attrs = document && document.attributes
  # NOTE this will discard any comment lines, but not skip blank lines
  process_attribute_entries reader, document

  , implicit_author, implicit_authorinitials = implicit_authors = {}, nil, nil

  if reader.has_more_lines? && !reader.next_line_empty?
    unless ( = process_authors reader.read_line).empty?
      if document
        # apply header subs and assign to document
        .each do |key, val|
          # NOTE the attributes substitution only applies for the email record
          doc_attrs[key] = ::String === val ? (document.apply_header_subs val) : val unless doc_attrs.key? key
        end

        implicit_author = doc_attrs['author']
        implicit_authorinitials = doc_attrs['authorinitials']
        implicit_authors = doc_attrs['authors']
      end

       = 
    end

    # NOTE this will discard any comment lines, but not skip blank lines
    process_attribute_entries reader, document

     = {}

    if reader.has_more_lines? && !reader.next_line_empty?
      rev_line = reader.read_line
      if (match = RevisionInfoLineRx.match(rev_line))
        ['revnumber'] = match[1].rstrip if match[1]
        unless (component = match[2].strip).empty?
          # version must begin with 'v' if date is absent
          if !match[1] && (component.start_with? 'v')
            ['revnumber'] = component.slice 1, component.length
          else
            ['revdate'] = component
          end
        end
        ['revremark'] = match[3].rstrip if match[3]
      else
        # throw it back
        reader.unshift_line rev_line
      end
    end

    unless .empty?
      if document
        # apply header subs and assign to document
        .each do |key, val|
          unless doc_attrs.key? key
            doc_attrs[key] = document.apply_header_subs val
          end
        end
      end

      .update 
    end

    # NOTE this will discard any comment lines, but not skip blank lines
    process_attribute_entries reader, document

    reader.skip_blank_lines
  else
     = {}
  end

  # process author attribute entries that override (or stand in for) the implicit author line
  if document
    if doc_attrs.key?('author') && (author_line = doc_attrs['author']) != implicit_author
      # do not allow multiple, process as names only
       = process_authors author_line, true, false
      .delete 'authorinitials' if doc_attrs['authorinitials'] != implicit_authorinitials
    elsif doc_attrs.key?('authors') && (author_line = doc_attrs['authors']) != implicit_authors
      # allow multiple, process as names only
       = process_authors author_line, true
    else
      authors, author_idx, author_key, explicit, sparse = [], 1, 'author_1', false, false
      while doc_attrs.key? author_key
        # only use indexed author attribute if value is different
        # leaves corner case if line matches with underscores converted to spaces; use double space to force
        if (author_override = doc_attrs[author_key]) == [author_key]
          authors << nil
          sparse = true
        else
          authors << author_override
          explicit = true
        end
        author_key = %(author_#{author_idx += 1})
      end
      if explicit
        # rebuild implicit author names to reparse
        authors.each_with_index do |author, idx|
          unless author
            authors[idx] = [
              [%(firstname_#{name_idx = idx + 1})],
              [%(middlename_#{name_idx})],
              [%(lastname_#{name_idx})]
            ].compact.map {|it| it.tr ' ', '_' }.join ' '
          end
        end if sparse
        # process as names only
         = process_authors authors, true, false
      else
         = {}
      end
    end

    if .empty?
      ['authorcount'] ||= (doc_attrs['authorcount'] = 0)
    else
      doc_attrs.update 

      # special case
      if !doc_attrs.key?('email') && doc_attrs.key?('email_1')
        doc_attrs['email'] = doc_attrs['email_1']
      end
    end
  end

  
end
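
The handling of the revision line (the second header line) can be sketched on its own. Everything named here is an assumption for illustration: `SKETCH_REV_LINE_RX` is a simplified stand-in for RevisionInfoLineRx, whose definition is not shown on this page, and `sketch_revision` is a hypothetical helper that follows the branch logic in the listing above.

```ruby
# ASSUMPTION: simplified pattern, not the real RevisionInfoLineRx.
# Groups: 1 = number before the comma, 2 = date (or lone version), 3 = remark.
SKETCH_REV_LINE_RX = /\A(?:\D*(.*?),)? *(.*?)(?: *: *(.*))?\z/

# Hypothetical helper: returns a Hash with revnumber, revdate and revremark.
def sketch_revision(line)
  rev = {}
  return rev unless (m = SKETCH_REV_LINE_RX.match(line))
  rev['revnumber'] = m[1].rstrip if m[1]
  component = m[2].strip
  unless component.empty?
    # without a comma-separated number, a leading "v" marks a version, not a date
    if !m[1] && component.start_with?('v')
      rev['revnumber'] = component[1..-1]
    else
      rev['revdate'] = component
    end
  end
  rev['revremark'] = m[3].rstrip if m[3]
  rev
end

puts sketch_revision('v1.0, 2012-12-21: Coincide w/ end of world.').inspect
```

The simplified pattern treats the first colon as the start of the remark, so it is only suitable for dates without a time component.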

.parse_list(reader, list_type, parent, style) ⇒ Object

Internal: Parse and construct an ordered or unordered list at the current position of the Reader

reader - The Reader from which to retrieve the list
list_type - A Symbol representing the list type (:olist for ordered, :ulist for unordered)
parent - The parent Block to which this list belongs
style - The block style assigned to this list (optional, default: nil)

Returns the Block encapsulating the parsed unordered or ordered list



# File 'lib/asciidoctor/parser.rb', line 1088

def self.parse_list reader, list_type, parent, style
  list_block = List.new parent, list_type
  list_rx = ListRxMap[list_type]

  while reader.has_more_lines? && list_rx =~ reader.peek_line
    # NOTE parse_list_item will stop at sibling item or end of list; never sees ancestor items
    if (list_item = parse_list_item reader, list_block, $~, $1, style)
      list_block.items << list_item
    end

    reader.skip_blank_lines || break
  end

  list_block
end

.parse_list_item(reader, list_block, match, sibling_trait, style = nil) ⇒ Object

Internal: Parse and construct the next ListItem (unordered, ordered, or callout list) or next term ListItem and description ListItem pair (description list) for the specified list Block.

First, collect and process all the lines that constitute the next list item for the specified list (according to its type). Next, create a ListItem (in the case of a description list, a description ListItem), parse the lines into blocks, and associate those blocks with that ListItem. Finally, fold the first block into the item’s text attribute according to rules described in ListItem.

reader - The Reader from which to retrieve the next list item
list_block - The parent list Block for this ListItem. Also provides access to the list type.
match - The MatchData that contains the list item marker and first line text of the ListItem
sibling_trait - The trait to match a sibling list item. For ordered and unordered lists, this is a String marker (e.g., '**' or 'ii)'). For description lists, this is a Regexp marker pattern.
style - The block style assigned to this list (optional, default: nil)

Returns the next ListItem or [[ListItem], ListItem] pair (description list) for the parent list Block.



# File 'lib/asciidoctor/parser.rb', line 1268

def self.parse_list_item(reader, list_block, match, sibling_trait, style = nil)
  if (list_type = list_block.context) == :dlist
    dlist = true
    list_term = ListItem.new(list_block, (term_text = match[1]))
    if term_text.start_with?('[[') && LeadingInlineAnchorRx =~ term_text
      catalog_inline_anchor $1, ($2 || $'.lstrip), list_term, reader
    end
    has_text = true if (item_text = match[3])
    list_item = ListItem.new(list_block, item_text)
    if list_block.document.sourcemap
      list_term.source_location = reader.cursor
      if has_text
        list_item.source_location = list_term.source_location
      else
        sourcemap_assignment_deferred = true
      end
    end
  else
    has_text = true
    list_item = ListItem.new(list_block, (item_text = match[2]))
    list_item.source_location = reader.cursor if list_block.document.sourcemap
    if list_type == :ulist
      list_item.marker = sibling_trait
      if item_text.start_with?('[')
        if style && style == 'bibliography'
          if InlineBiblioAnchorRx =~ item_text
            catalog_inline_biblio_anchor $1, $2, list_item, reader
          end
        elsif item_text.start_with?('[[')
          if LeadingInlineAnchorRx =~ item_text
            catalog_inline_anchor $1, $2, list_item, reader
          end
        elsif item_text.start_with?('[ ] ', '[x] ', '[*] ')
          list_block.set_option 'checklist'
          list_item.attributes['checkbox'] = ''
          list_item.attributes['checked'] = '' unless item_text.start_with? '[ '
          list_item.text = item_text.slice(4, item_text.length)
        end
      end
    elsif list_type == :olist
      sibling_trait, implicit_style = resolve_ordered_list_marker(sibling_trait, (ordinal = list_block.items.size), true, reader)
      list_item.marker = sibling_trait
      if ordinal == 0 && !style
        # using list level makes more sense, but we don't track it
        # basing style on marker level is compliant with AsciiDoc Python
        list_block.style = implicit_style || ((ORDERED_LIST_STYLES[sibling_trait.length - 1] || 'arabic').to_s)
      end
      if item_text.start_with?('[[') && LeadingInlineAnchorRx =~ item_text
        catalog_inline_anchor $1, $2, list_item, reader
      end
    else # :colist
      list_item.marker = sibling_trait
      if item_text.start_with?('[[') && LeadingInlineAnchorRx =~ item_text
        catalog_inline_anchor $1, $2, list_item, reader
      end
    end
  end

  # first skip the line with the marker / term (it gets put back onto the reader by next_block)
  reader.shift
  block_cursor = reader.cursor
  list_item_reader = Reader.new read_lines_for_list_item(reader, list_type, sibling_trait, has_text), block_cursor
  if list_item_reader.has_more_lines?
    list_item.source_location = block_cursor if sourcemap_assignment_deferred
    # NOTE peek on the other side of any comment lines
    comment_lines = list_item_reader.skip_line_comments
    if (subsequent_line = list_item_reader.peek_line)
      list_item_reader.unshift_lines comment_lines unless comment_lines.empty?
      if (continuation_connects_first_block = subsequent_line.empty?)
        content_adjacent = false
      else
        content_adjacent = true
        # treat lines as paragraph text if continuation does not connect first block (i.e., has_text = nil)
        has_text = nil unless dlist
      end
    else
      # NOTE we have no use for any trailing comment lines we might have found
      continuation_connects_first_block = false
      content_adjacent = false
    end

    # reader is confined to boundaries of list, which means only blocks will be found (no sections)
    if (block = next_block(list_item_reader, list_item, {}, text: !has_text, list_item: true))
      list_item.blocks << block
    end

    while list_item_reader.has_more_lines?
      if (block = next_block(list_item_reader, list_item, {}, list_item: true))
        list_item.blocks << block
      end
    end

    list_item.fold_first(continuation_connects_first_block, content_adjacent)
  end

  dlist ? [[list_term], (list_item.text? || list_item.blocks? ? list_item : nil)] : list_item
end
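
The checklist branch for unordered lists is small enough to show on its own. This is a minimal sketch, assuming a hypothetical helper `sketch_checkbox` that returns a plain Hash instead of mutating a ListItem and its parent list.

```ruby
# Hypothetical helper: applies the checklist rules from the :ulist branch
# of parse_list_item to the text of a single item.
def sketch_checkbox(item_text)
  # only "[ ] ", "[x] " and "[*] " (note the trailing space) mark a checkbox
  return { text: item_text, attributes: {} } unless item_text.start_with?('[ ] ', '[x] ', '[*] ')
  attributes = { 'checkbox' => '' }
  # anything other than "[ ] " counts as checked
  attributes['checked'] = '' unless item_text.start_with?('[ ')
  # drop the 4-character marker from the item text
  { text: item_text[4..-1], attributes: attributes }
end

puts sketch_checkbox('[x] write tests').inspect
```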

.parse_manpage_header(reader, document, block_attributes) ⇒ Object

Public: Parses the manpage header of the AsciiDoc source read from the Reader

Returns nothing.



# File 'lib/asciidoctor/parser.rb', line 192

def self.parse_manpage_header(reader, document, block_attributes)
  if ManpageTitleVolnumRx =~ (doc_attrs = document.attributes)['doctitle']
    doc_attrs['manvolnum'] = manvolnum = $2
    doc_attrs['mantitle'] = (((mantitle = $1).include? ATTR_REF_HEAD) ? (document.sub_attributes mantitle) : mantitle).downcase
  else
    logger.error message_with_context 'non-conforming manpage title', source_location: (reader.cursor_at_line 1)
    # provide sensible fallbacks
    doc_attrs['mantitle'] = doc_attrs['doctitle'] || doc_attrs['docname'] || 'command'
    doc_attrs['manvolnum'] = manvolnum = '1'
  end
  if (manname = doc_attrs['manname']) && doc_attrs['manpurpose']
    doc_attrs['manname-title'] ||= 'Name'
    doc_attrs['mannames'] = [manname]
    if document.backend == 'manpage'
      doc_attrs['docname'] = manname
      doc_attrs['outfilesuffix'] = %(.#{manvolnum})
    end
  else
    reader.skip_blank_lines
    reader.save
    block_attributes.update  reader, document
    if (name_section_level = is_next_line_section? reader, {})
      if name_section_level == 1
        name_section = initialize_section reader, document, {}
        name_section_buffer = (reader.read_lines_until break_on_blank_lines: true, skip_line_comments: true).map {|l| l.lstrip }.join ' '
        if ManpageNamePurposeRx =~ name_section_buffer
          doc_attrs['manname-title'] ||= name_section.title
          doc_attrs['manname-id'] = name_section.id if name_section.id
          doc_attrs['manpurpose'] = $2
          if (manname = $1).include? ATTR_REF_HEAD
            manname = document.sub_attributes manname
          end
          if manname.include? ','
            manname = (mannames = (manname.split ',').map {|n| n.lstrip })[0]
          else
            mannames = [manname]
          end
          doc_attrs['manname'] = manname
          doc_attrs['mannames'] = mannames
          if document.backend == 'manpage'
            doc_attrs['docname'] = manname
            doc_attrs['outfilesuffix'] = %(.#{manvolnum})
          end
        else
          error_msg = 'non-conforming name section body'
        end
      else
        error_msg = 'name section must be at level 1'
      end
    else
      error_msg = 'name section expected'
    end
    if error_msg
      reader.restore_save
      logger.error message_with_context error_msg, source_location: reader.cursor
      doc_attrs['manname'] = manname = doc_attrs['docname'] || 'command'
      doc_attrs['mannames'] = [manname]
      if document.backend == 'manpage'
        doc_attrs['docname'] = manname
        doc_attrs['outfilesuffix'] = %(.#{manvolnum})
      end
    else
      reader.discard_save
    end
  end
  nil
end
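
The two text patterns this method relies on, a `name(volnum)` document title and a `name - purpose` name section, can be sketched as follows. The regular expressions and helper names are assumptions made for illustration; ManpageTitleVolnumRx and ManpageNamePurposeRx are not shown on this page and may differ in detail. Attribute reference substitution is left out.

```ruby
# ASSUMPTION: illustrative patterns, not the library's own constants.
SKETCH_MAN_TITLE_RX = /\A(.+?) *\( *(.+?) *\)\z/
SKETCH_MAN_NAME_RX = /\A(.+?) +- +(.+)\z/

# Hypothetical helper: derives mantitle and manvolnum from the doctitle,
# using the same fallbacks as the else branch in the listing above.
def sketch_manpage_title(doctitle, docname = nil)
  if doctitle && (m = SKETCH_MAN_TITLE_RX.match(doctitle))
    { 'mantitle' => m[1].downcase, 'manvolnum' => m[2] }
  else
    { 'mantitle' => doctitle || docname || 'command', 'manvolnum' => '1' }
  end
end

# Hypothetical helper: splits the body of the name section into names and purpose.
def sketch_name_section(text)
  return nil unless (m = SKETCH_MAN_NAME_RX.match(text))
  mannames = m[1].split(',').map(&:lstrip)
  { 'manname' => mannames[0], 'mannames' => mannames, 'manpurpose' => m[2] }
end

puts sketch_manpage_title('GIT-COMMIT (1)').inspect
puts sketch_name_section('git-commit, gc - Record changes to the repository').inspect
```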

.parse_section_title(reader, document, sect_id = nil) ⇒ Object

Internal: Parse the section title from the current position of the reader

Parse an atx (single-line) or setext (underlined) section title. After this method is called, the Reader will be positioned at the line after the section title.

For efficiency, we don’t reuse methods internally that check for a section title.

reader - the source [Reader], positioned at a section title.
document - the current [Document].

Examples

reader.lines
# => ["Foo", "~~~"]

id, reftext, title, level, atx = parse_section_title(reader, document)

title
# => "Foo"
level
# => 2
id
# => nil
atx
# => false

line1
# => "==== Foo"

id, reftext, title, level, atx = parse_section_title(reader, document)

title
# => "Foo"
level
# => 3
id
# => nil
atx
# => true

Returns a 5-element [Array] containing the id (String), reftext (String), title (String), level (Integer), and flag (Boolean) indicating whether an atx section title was matched, or nothing.



# File 'lib/asciidoctor/parser.rb', line 1739

def self.parse_section_title(reader, document, sect_id = nil)
  sect_reftext = nil
  line1 = reader.read_line

  if Compliance.markdown_syntax ? ((line1.start_with? '=', '#') && ExtAtxSectionTitleRx =~ line1) :
      ((line1.start_with? '=') && AtxSectionTitleRx =~ line1)
    # NOTE level is 1 less than number of line markers
    sect_level, sect_title, atx = $1.length - 1, $2, true
    if sect_title.end_with?(']]') && InlineSectionAnchorRx =~ sect_title && !$1 # escaped
      sect_title, sect_id, sect_reftext = (sect_title.slice 0, sect_title.length - $&.length), $2, $3
    end unless sect_id
  elsif Compliance.underline_style_section_titles && (line2 = reader.peek_line(true)) &&
      (sect_level = SETEXT_SECTION_LEVELS[line2_ch0 = line2.chr]) && (uniform? line2, line2_ch0, (line2_len = line2.length)) &&
      (sect_title = SetextSectionTitleRx =~ line1 && $1) && (line1.length - line2_len).abs < 2
    atx = false
    if sect_title.end_with?(']]') && InlineSectionAnchorRx =~ sect_title && !$1 # escaped
      sect_title, sect_id, sect_reftext = (sect_title.slice 0, sect_title.length - $&.length), $2, $3
    end unless sect_id
    reader.shift
  else
    raise %(Unrecognized section at #{reader.cursor_at_prev_line})
  end
  if document.attr? 'leveloffset'
    sect_level += (document.attr 'leveloffset').to_i
    sect_level = 0 if sect_level < 0
  end
  [sect_id, sect_reftext, sect_title, sect_level, atx]
end
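
The level arithmetic can be checked with a small standalone sketch. The names below are hypothetical, the atx pattern is a simplified stand-in for AtxSectionTitleRx, and the underline-to-level table is an assumption consistent with the `~~~` example above (level 2); inline anchors and Markdown-style `#` markers are not handled.

```ruby
# ASSUMPTION: underline character to level mapping; SETEXT_SECTION_LEVELS
# itself is not shown on this page.
SKETCH_SETEXT_LEVELS = { '=' => 0, '-' => 1, '~' => 2, '^' => 3, '+' => 4 }

# Hypothetical helper: returns [title, level, atx] for one or two title lines.
def sketch_section_title(line1, line2 = nil, leveloffset = 0)
  if (m = /\A(=+) +(\S.*)\z/.match(line1))
    # atx: the level is one less than the number of leading markers
    level, title, atx = m[1].length - 1, m[2], true
  elsif line2 && !line2.empty? && (level = SKETCH_SETEXT_LEVELS[line2[0]]) &&
      line2.each_char.all? { |ch| ch == line2[0] } && (line1.length - line2.length).abs < 2
    # setext: the underline must be uniform and within one character of the title length
    title, atx = line1, false
  else
    raise ArgumentError, %(unrecognized section title: #{line1.inspect})
  end
  # leveloffset shifts the level, which is clamped at 0
  level = [level + leveloffset, 0].max
  [title, level, atx]
end

puts sketch_section_title('==== Foo').inspect
puts sketch_section_title('Foo', '~~~').inspect
```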

.parse_style_attribute(attributes, reader = nil) ⇒ Object

Public: Parse the first positional attribute and assign named attributes

Parse the first positional attribute to extract the style, role and id parts, assign the values to their corresponding attribute keys and return the parsed style from the first positional attribute.

attributes - The Hash of attributes to process and update

Examples

puts attributes
=> { 1 => "abstract#intro.lead%fragment", "style" => "preamble" }

parse_style_attribute(attributes)
=> "abstract"

puts attributes
=> { 1 => "abstract#intro.lead%fragment", "style" => "abstract", "id" => "intro",
      "role" => "lead", "options" => "fragment", "fragment-option" => '' }

Returns the String style parsed from the first positional attribute



# File 'lib/asciidoctor/parser.rb', line 2548

def self.parse_style_attribute attributes, reader = nil
  # NOTE spaces are not allowed in shorthand, so if we detect one, this ain't no shorthand
  if (raw_style = attributes[1]) && !raw_style.include?(' ') && Compliance.shorthand_property_syntax
    name = nil
    accum = ''
    parsed_attrs = {}

    raw_style.each_char do |c|
      case c
      when '.'
        yield_buffered_attribute parsed_attrs, name, accum, reader
        accum = ''
        name = :role
      when '#'
        yield_buffered_attribute parsed_attrs, name, accum, reader
        accum = ''
        name = :id
      when '%'
        yield_buffered_attribute parsed_attrs, name, accum, reader
        accum = ''
        name = :option
      else
        accum = accum + c
      end
    end

    # small optimization if no shorthand is found
    if name
      yield_buffered_attribute parsed_attrs, name, accum, reader

      if (parsed_style = parsed_attrs[:style])
        attributes['style'] = parsed_style
      end

      attributes['id'] = parsed_attrs[:id] if parsed_attrs.key? :id

      if parsed_attrs.key? :role
        attributes['role'] = (existing_role = attributes['role']).nil_or_empty? ? (parsed_attrs[:role].join ' ') : %(#{existing_role} #{parsed_attrs[:role].join ' '})
      end

      if parsed_attrs.key? :option
        (opts = parsed_attrs[:option]).each {|opt| attributes[%(#{opt}-option)] = '' }
      end

      parsed_style
    else
      attributes['style'] = raw_style
    end
  else
    attributes['style'] = raw_style
  end
end
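
The character scan that splits the shorthand into parts can be sketched as a standalone function. `sketch_shorthand` is a hypothetical helper; it stands in for the loop plus yield_buffered_attribute (not shown on this page) and, as an assumption, silently skips empty parts rather than warning about them.

```ruby
# Hypothetical helper: splits "style#id.role%option" shorthand into parts.
def sketch_shorthand(raw_style)
  parsed = {}
  name = :style
  accum = ''
  # store the buffered text under the current part name, then reset the buffer
  store = lambda do
    unless accum.empty?
      if name == :role || name == :option
        (parsed[name] ||= []) << accum
      else
        parsed[name] = accum
      end
    end
    accum = ''
  end
  raw_style.each_char do |c|
    case c
    when '.'
      store.call
      name = :role
    when '#'
      store.call
      name = :id
    when '%'
      store.call
      name = :option
    else
      accum += c
    end
  end
  store.call
  parsed
end

puts sketch_shorthand('abstract#intro.lead%fragment').inspect
```

Roles and options accumulate into Arrays because the shorthand may name several of each, whereas style and id hold a single value.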

.parse_table(table_reader, parent, attributes) ⇒ Object

Internal: Parse the table contained in the provided Reader

table_reader - a Reader containing the source lines of an AsciiDoc table
parent - the parent Block of this Asciidoctor::Table
attributes - attributes captured from above this Block

Returns an instance of Asciidoctor::Table parsed from the provided reader



# File 'lib/asciidoctor/parser.rb', line 2269

def self.parse_table(table_reader, parent, attributes)
  table = Table.new(parent, attributes)
  if attributes.key? 'title'
    table.title = attributes.delete 'title'
    table.assign_caption(attributes.delete 'caption')
  end

  if (attributes.key? 'cols') && !(colspecs = parse_colspecs attributes['cols']).empty?
    table.create_columns colspecs
    explicit_colspecs = true
  end

  skipped = table_reader.skip_blank_lines || 0
  parser_ctx = Table::ParserContext.new table_reader, table, attributes
  format, loop_idx, implicit_header_boundary = parser_ctx.format, -1, nil
  implicit_header = true unless skipped > 0 || attributes['header-option'] || attributes['noheader-option']

  while (line = table_reader.read_line)
    if (beyond_first = (loop_idx += 1) > 0) && line.empty?
      line = nil
      implicit_header_boundary += 1 if implicit_header_boundary
    elsif format == 'psv'
      if parser_ctx.starts_with_delimiter? line
        line = line.slice 1, line.length
        # push empty cell spec if cell boundary appears at start of line
        parser_ctx.close_open_cell
        implicit_header_boundary = nil if implicit_header_boundary
      else
        next_cellspec, line = parse_cellspec line, :start, parser_ctx.delimiter
        # if cellspec is not nil, we're at a cell boundary
        if next_cellspec
          parser_ctx.close_open_cell next_cellspec
          implicit_header_boundary = nil if implicit_header_boundary
        # otherwise, the cell continues from previous line
        elsif implicit_header_boundary && implicit_header_boundary == loop_idx
          implicit_header, implicit_header_boundary = false, nil
        end
      end
    end

    unless beyond_first
      table_reader.mark
      # NOTE implicit header is offset by at least one blank line; implicit_header_boundary tracks size of gap
      if implicit_header
        if table_reader.has_more_lines? && table_reader.peek_line.empty?
          implicit_header_boundary = 1
        else
          implicit_header = false
        end
      end
    end

    # this loop is used for flow control; internal logic controls how many times it executes
    while true
      if line && (m = parser_ctx.match_delimiter line)
        pre_match, post_match = m.pre_match, m.post_match
        case format
        when 'csv'
          if parser_ctx.buffer_has_unclosed_quotes? pre_match
            parser_ctx.skip_past_delimiter pre_match
            break if (line = post_match).empty?
            redo
          end
          parser_ctx.buffer = %(#{parser_ctx.buffer}#{pre_match})
        when 'dsv'
          if pre_match.end_with? '\\'
            parser_ctx.skip_past_escaped_delimiter pre_match
            if (line = post_match).empty?
              parser_ctx.buffer = %(#{parser_ctx.buffer}#{LF})
              parser_ctx.keep_cell_open
              break
            end
            redo
          end
          parser_ctx.buffer = %(#{parser_ctx.buffer}#{pre_match})
        else # psv
          if pre_match.end_with? '\\'
            parser_ctx.skip_past_escaped_delimiter pre_match
            if (line = post_match).empty?
              parser_ctx.buffer = %(#{parser_ctx.buffer}#{LF})
              parser_ctx.keep_cell_open
              break
            end
            redo
          end
          next_cellspec, cell_text = parse_cellspec pre_match
          parser_ctx.push_cellspec next_cellspec
          parser_ctx.buffer = %(#{parser_ctx.buffer}#{cell_text})
        end
        # don't break if empty to preserve empty cell found at end of line (see issue #1106)
        line = nil if (line = post_match).empty?
        parser_ctx.close_cell
      else
        # no other delimiters to see here; suck up this line into the buffer and move on
        parser_ctx.buffer = %(#{parser_ctx.buffer}#{line}#{LF})
        case format
        when 'csv'
          if parser_ctx.buffer_has_unclosed_quotes?
            implicit_header, implicit_header_boundary = false, nil if implicit_header_boundary && loop_idx == 0
            parser_ctx.keep_cell_open
          else
            parser_ctx.close_cell true
          end
        when 'dsv'
          parser_ctx.close_cell true
        else # psv
          parser_ctx.keep_cell_open
        end
        break
      end
    end

    # NOTE cell may already be closed if table format is csv or dsv
    if parser_ctx.cell_open?
      parser_ctx.close_cell true unless table_reader.has_more_lines?
    else
      table_reader.skip_blank_lines || break
    end
  end

  unless (table.attributes['colcount'] ||= table.columns.size) == 0 || explicit_colspecs
    table.assign_column_widths
  end

  if implicit_header
    table.has_header_option = true
    attributes['header-option'] = ''
  end

  table.partition_header_footer attributes

  table
end
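
The implicit header rule is easy to lose among the cell handling above. The following is a minimal sketch of that rule alone, assuming the table body is already available as an Array of lines. The helper implicit_header? is hypothetical and not part of Asciidoctor, and it ignores the PSV case in which a cell that continues across the blank line cancels the header.

```ruby
# Simplified sketch of the implicit header rule (hypothetical helper, not Asciidoctor's API).
#
# lines      - the Array of String lines found between the table delimiters
# attributes - the Hash of block attributes for the table
#
# Returns true if the first row would be promoted to a header row.
def implicit_header?(lines, attributes = {})
  # an explicit header or noheader option always wins
  return false if attributes.key?('header-option') || attributes.key?('noheader-option')
  # a blank line before the first row disables the implicit header
  return false if lines.empty? || lines[0].empty?
  # the first row must be followed directly by a blank line
  lines.size > 1 && lines[1].empty?
end

puts implicit_header?(['|Name |Age', '', '|Ann |30']) # first row offset by a blank line
puts implicit_header?(['|Name |Age', '|Ann |30'])     # no blank line after the first row
```

The sketch checks only for the presence of the option keys, whereas the method above tests the truthiness of their values.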

.process_attribute_entries(reader, document, attributes = nil) ⇒ Object

Process consecutive attribute entry lines, ignoring adjacent line comments and comment blocks.

Returns nothing



# File 'lib/asciidoctor/parser.rb', line 2071

def self.process_attribute_entries reader, document, attributes = nil
  reader.skip_comment_lines
  while process_attribute_entry reader, document, attributes
    # discard line just processed
    reader.shift
    reader.skip_comment_lines
  end
end

.process_attribute_entry(reader, document, attributes = nil, match = nil) ⇒ Object



# File 'lib/asciidoctor/parser.rb', line 2080

def self.process_attribute_entry reader, document, attributes = nil, match = nil
  if match || (match = reader.has_more_lines? ? (AttributeEntryRx.match reader.peek_line) : nil)
    if (value = match[2]).nil_or_empty?
      value = ''
    elsif value.end_with? LINE_CONTINUATION, LINE_CONTINUATION_LEGACY
      con, value = (value.slice value.length - 2, 2), (value.slice 0, value.length - 2).rstrip
      while reader.advance && !(next_line = reader.peek_line || '').empty?
        next_line = next_line.lstrip
        next_line = (next_line.slice 0, next_line.length - 2).rstrip if (keep_open = next_line.end_with? con)
        value = %(#{value}#{(value.end_with? HARD_LINE_BREAK) ? LF : ' '}#{next_line})
        break unless keep_open
      end
    end

    store_attribute match[1], value, document, attributes
    true
  end
end
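
The continuation handling above can be sketched without a Reader. This is an illustration only: the constant values are assumptions (the real ones are defined elsewhere in Asciidoctor), the legacy continuation marker is left out, and join_continued_value is a hypothetical helper that takes the value lines as an Array.

```ruby
# Assumed values for illustration; the real constants are defined elsewhere in Asciidoctor.
LINE_CONTINUATION = ' \\' # a space followed by a backslash
HARD_LINE_BREAK = ' +'    # a space followed by a plus
LF = "\n"

# Joins the lines of a multi-line attribute value (hypothetical helper).
#
# lines - an Array of Strings; lines[0] is the value taken from the entry line
#
# Returns the joined String value.
def join_continued_value(lines)
  value = lines[0]
  return value unless value.end_with? LINE_CONTINUATION
  # drop the continuation marker and any whitespace before it
  value = value[0...-2].rstrip
  lines.drop(1).each do |next_line|
    break if next_line.empty?
    next_line = next_line.lstrip
    keep_open = next_line.end_with? LINE_CONTINUATION
    next_line = next_line[0...-2].rstrip if keep_open
    # a hard line break at the end of the value is preserved as a newline
    separator = (value.end_with? HARD_LINE_BREAK) ? LF : ' '
    value = %(#{value}#{separator}#{next_line})
    break unless keep_open
  end
  value
end

puts join_continued_value(['first \\', '  second \\', 'third'])
```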

.process_authors(author_line, names_only = false, multiple = true) ⇒ Object

Internal: Parse the author line into a Hash of author metadata

author_line - the String author line
names_only - a Boolean flag that indicates whether to process line as names only or names with emails (default: false)
multiple - a Boolean flag that indicates whether to process multiple semicolon-separated entries in the author line (default: true)

Returns a Hash of author metadata



# File 'lib/asciidoctor/parser.rb', line 1916

def self.process_authors author_line, names_only = false, multiple = true
  author_metadata = {}
  author_idx = 0
  (multiple && (author_line.include? ';') ? (author_line.split AuthorDelimiterRx) : [*author_line]).each do |author_entry|
    next if author_entry.empty?
    key_map = {}
    if (author_idx += 1) == 1
      AuthorKeys.each {|key| key_map[key.to_sym] = key }
    else
      AuthorKeys.each {|key| key_map[key.to_sym] = %(#{key}_#{author_idx}) }
    end

    if names_only # when parsing an attribute value
      # QUESTION should we rstrip author_entry?
      if author_entry.include? '<'
        author_metadata[key_map[:author]] = author_entry.tr('_', ' ')
        author_entry = author_entry.gsub XmlSanitizeRx, ''
      end
      # NOTE split names and collapse repeating whitespace (split drops any leading whitespace)
      if (segments = author_entry.split nil, 3).size == 3
        segments << (segments.pop.squeeze ' ')
      end
    elsif (match = AuthorInfoLineRx.match(author_entry))
      (segments = match.to_a).shift
    end

    if segments
      author = author_metadata[key_map[:firstname]] = fname = segments[0].tr('_', ' ')
      author_metadata[key_map[:authorinitials]] = fname.chr
      if segments[1]
        if segments[2]
          author_metadata[key_map[:middlename]] = mname = segments[1].tr('_', ' ')
          author_metadata[key_map[:lastname]] = lname = segments[2].tr('_', ' ')
          author = fname + ' ' + mname + ' ' + lname
          author_metadata[key_map[:authorinitials]] = %(#{fname.chr}#{mname.chr}#{lname.chr})
        else
          author_metadata[key_map[:lastname]] = lname = segments[1].tr('_', ' ')
          author = fname + ' ' + lname
          author_metadata[key_map[:authorinitials]] = %(#{fname.chr}#{lname.chr})
        end
      end
      author_metadata[key_map[:author]] ||= author
      author_metadata[key_map[:email]] = segments[3] unless names_only || !segments[3]
    else
      author_metadata[key_map[:author]] = author_metadata[key_map[:firstname]] = fname = author_entry.squeeze(' ').strip
      author_metadata[key_map[:authorinitials]] = fname.chr
    end

    if author_idx == 1
      author_metadata['authors'] = author_metadata[key_map[:author]]
    else
      # only assign the _1 attributes once we see the second author
      if author_idx == 2
        AuthorKeys.each {|key| author_metadata[%(#{key}_1)] = author_metadata[key] if author_metadata.key? key }
      end
      author_metadata['authors'] = %(#{author_metadata['authors']}, #{author_metadata[key_map[:author]]})
    end
  end

  author_metadata['authorcount'] = author_idx
  author_metadata
end
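
The way name segments map onto the author fields can be shown in isolation. The sketch below is a simplification, not the method itself: author_fields is a hypothetical helper that handles a single author whose name has already been split into one to three segments, and it leaves out the email, the numbered keys for additional authors, and the author count.

```ruby
# Maps one to three name segments onto author fields (hypothetical helper).
#
# segments - an Array of one to three String name segments;
#            underscores stand in for spaces within a segment
#
# Returns a Hash of author fields keyed by attribute name.
def author_fields(segments)
  fname, second, third = segments.map {|segment| segment.tr('_', ' ') }
  fields = { 'firstname' => fname, 'author' => fname, 'authorinitials' => fname.chr }
  if second && third
    # three segments: first, middle and last name
    fields.update 'middlename' => second, 'lastname' => third,
      'author' => %(#{fname} #{second} #{third}),
      'authorinitials' => %(#{fname.chr}#{second.chr}#{third.chr})
  elsif second
    # two segments: first and last name
    fields.update 'lastname' => second,
      'author' => %(#{fname} #{second}),
      'authorinitials' => %(#{fname.chr}#{second.chr})
  end
  fields
end

p author_fields(%w(Doc Writer))
p author_fields(%w(Mary_Sue Smith))
```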

.read_lines_for_list_item(reader, list_type, sibling_trait = nil, has_text = true) ⇒ Object

Internal: Collect the lines belonging to the current list item, navigating through all the rules that determine what comprises a list item.

Grab lines until a sibling list item is found, or the block is broken by a terminator (such as a line comment). Description lists are greedier when the item has no inline text: they keep reading until they find that text.

reader - The Reader from which to retrieve the lines.
list_type - The Symbol context of the list (:ulist, :olist, :colist or :dlist)
sibling_trait - A Regexp that matches a sibling of this list item or String list marker of the items in this list (default: nil)
has_text - Whether the list item has text defined inline (always true except for description lists)

Returns an Array of lines belonging to the current list item.



# File 'lib/asciidoctor/parser.rb', line 1380

def self.read_lines_for_list_item(reader, list_type, sibling_trait = nil, has_text = true)
  buffer = []

  # three states for continuation: :inactive, :active & :frozen
  # :frozen signifies we've detected sequential continuation lines &
  # continuation is not permitted until reset
  continuation = :inactive

  # if we are within a nested list, we don't throw away the list
  # continuation marks because they will be processed when grabbing
  # the lines for those nested lists
  within_nested_list = false

  # a detached continuation is a list continuation that follows a blank line
  # it gets associated with the outermost block
  detached_continuation = nil

  dlist = list_type == :dlist

  while reader.has_more_lines?
    this_line = reader.read_line

    # if we've arrived at a sibling item in this list, we've captured
    # the complete list item and can begin processing it
    # the remainder of the method determines whether we've reached
    # the termination of the list
    break if is_sibling_list_item?(this_line, list_type, sibling_trait)

    prev_line = buffer.empty? ? nil : buffer[-1]

    if prev_line == LIST_CONTINUATION
      if continuation == :inactive
        continuation = :active
        has_text = true
        buffer[-1] = '' unless within_nested_list
      end

      # dealing with adjacent list continuations (which is really a syntax error)
      if this_line == LIST_CONTINUATION
        if continuation != :frozen
          continuation = :frozen
          buffer << this_line
        end
        this_line = nil
        next
      end
    end

    # a delimited block immediately breaks the list unless preceded
    # by a list continuation (they are harsh like that ;0)
    if (match = is_delimited_block?(this_line, true))
      if continuation == :active
        buffer << this_line
        # grab all the lines in the block, leaving the delimiters in place
        # we're being more strict here about the terminator, but I think that's a good thing
        buffer.concat reader.read_lines_until(terminator: match.terminator, read_last_line: true, context: nil)
        continuation = :inactive
      else
        break
      end
    # technically BlockAttributeLineRx only breaks if ensuing line is not a list item
    # which really means BlockAttributeLineRx only breaks if it's acting as a block delimiter
    # FIXME to be AsciiDoc compliant, we shouldn't break if style in attribute line is "literal" (i.e., [literal])
    elsif dlist && continuation != :active && (BlockAttributeLineRx.match? this_line)
      break
    else
      if continuation == :active && !this_line.empty?
        # literal paragraphs have special considerations (and this is one of
        # two entry points into one)
        # if we don't process it as a whole, then a line in it that looks like a
        # list item will throw off the exit from it
        if LiteralParagraphRx.match? this_line
          reader.unshift_line this_line
          if dlist
            # we may be in an indented list disguised as a literal paragraph
            # so we need to make sure we don't slurp up a legitimate sibling
            buffer.concat reader.read_lines_until(preserve_last_line: true, break_on_blank_lines: true, break_on_list_continuation: true) {|line| is_sibling_list_item? line, list_type, sibling_trait }
          else
            buffer.concat reader.read_lines_until(preserve_last_line: true, break_on_blank_lines: true, break_on_list_continuation: true)
          end
          continuation = :inactive
        # let block metadata play out until we find the block
        elsif (BlockTitleRx.match? this_line) || (BlockAttributeLineRx.match? this_line) || (AttributeEntryRx.match? this_line)
          buffer << this_line
        else
          if nested_list_type = (within_nested_list ? [:dlist] : NESTABLE_LIST_CONTEXTS).find {|ctx| ListRxMap[ctx].match? this_line }
            within_nested_list = true
            if nested_list_type == :dlist && $3.nil_or_empty?
              # get greedy again
              has_text = false
            end
          end
          buffer << this_line
          continuation = :inactive
        end
      elsif prev_line && prev_line.empty?
        # advance to the next line of content
        if this_line.empty?
          # stop reading if we reach eof
          break unless (this_line = reader.skip_blank_lines && reader.read_line)
          # stop reading if we hit a sibling list item
          break if is_sibling_list_item? this_line, list_type, sibling_trait
        end

        if this_line == LIST_CONTINUATION
          detached_continuation = buffer.size
          buffer << this_line
        else
          # has_text is only relevant for dlist, which is more greedy until it has text for an item
          # for all other lists, has_text is always true
          # in this block, we have to see whether we stay in the list
          if has_text
            # TODO any way to combine this with the check after skipping blank lines?
            if is_sibling_list_item?(this_line, list_type, sibling_trait)
              break
            elsif nested_list_type = NESTABLE_LIST_CONTEXTS.find {|ctx| ListRxMap[ctx] =~ this_line }
              buffer << this_line
              within_nested_list = true
              if nested_list_type == :dlist && $3.nil_or_empty?
                # get greedy again
                has_text = false
              end
            # slurp up any literal paragraph offset by blank lines
            # NOTE we have to check for indented list items first
            elsif LiteralParagraphRx.match? this_line
              reader.unshift_line this_line
              if dlist
                # we may be in an indented list disguised as a literal paragraph
                # so we need to make sure we don't slurp up a legitimate sibling
                buffer.concat reader.read_lines_until(preserve_last_line: true, break_on_blank_lines: true, break_on_list_continuation: true) {|line| is_sibling_list_item? line, list_type, sibling_trait }
              else
                buffer.concat reader.read_lines_until(preserve_last_line: true, break_on_blank_lines: true, break_on_list_continuation: true)
              end
            else
              break
            end
          else # only dlist in need of item text, so slurp it up!
            # pop the blank line so it's not interpreted as a list continuation
            buffer.pop unless within_nested_list
            buffer << this_line
            has_text = true
          end
        end
      else
        has_text = true if !this_line.empty?
        if nested_list_type = (within_nested_list ? [:dlist] : NESTABLE_LIST_CONTEXTS).find {|ctx| ListRxMap[ctx] =~ this_line }
          within_nested_list = true
          if nested_list_type == :dlist && $3.nil_or_empty?
            # get greedy again
            has_text = false
          end
        end
        buffer << this_line
      end
    end
    this_line = nil
  end

  reader.unshift_line this_line if this_line

  if detached_continuation
    buffer.delete_at detached_continuation
  end

  # strip trailing blank lines to prevent empty blocks
  buffer.pop while !buffer.empty? && buffer[-1].empty?

  # We do need to replace the optional trailing continuation
  # a blank line would have served the same purpose in the document
  buffer.pop if !buffer.empty? && buffer[-1] == LIST_CONTINUATION

  #warn "BUFFER[#{list_type},#{sibling_trait}]>#{buffer.join LF}<BUFFER"
  #warn "BUFFER[#{list_type},#{sibling_trait}]>#{buffer.inspect}<BUFFER"

  buffer
end
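
The cleanup at the end of the method is the easiest part to check in isolation. The sketch below reproduces only those two steps on a plain Array. It assumes the list continuation marker is a lone plus sign, and trim_list_item_buffer is a hypothetical helper.

```ruby
# Assumed value for illustration; the real constant is defined elsewhere in Asciidoctor.
LIST_CONTINUATION = '+'

# Applies the final cleanup steps to a copy of the collected lines (hypothetical helper).
#
# buffer - the Array of String lines collected for a list item
#
# Returns a new Array with the trailing blank lines and one trailing continuation removed.
def trim_list_item_buffer(buffer)
  buffer = buffer.dup
  # strip trailing blank lines to prevent empty blocks
  buffer.pop while !buffer.empty? && buffer[-1].empty?
  # drop a trailing continuation, which attaches nothing
  buffer.pop if !buffer.empty? && buffer[-1] == LIST_CONTINUATION
  buffer
end

p trim_list_item_buffer(['item text', '+', '', ''])
```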

.read_paragraph_lines(reader, break_at_list, opts = {}) ⇒ Object



# File 'lib/asciidoctor/parser.rb', line 925

def self.read_paragraph_lines reader, break_at_list, opts = {}
  opts[:break_on_blank_lines] = true
  opts[:break_on_list_continuation] = true
  opts[:preserve_last_line] = true
  break_condition = (break_at_list ?
      (Compliance.block_terminates_paragraph ? StartOfBlockOrListProc : StartOfListProc) :
      (Compliance.block_terminates_paragraph ? StartOfBlockProc : NoOp))
  reader.read_lines_until opts, &break_condition
end

.resolve_list_marker(list_type, marker, ordinal = 0, validate = false, reader = nil) ⇒ Object

Internal: Resolve the 0-index marker for this list item

For ordered lists, match the marker used for this list item against the known list markers and determine which marker is the first (0-index) marker in its number series.

For callout lists, return <1>.

For bulleted lists, return the marker as passed to this method.

list_type - The Symbol context of the list
marker - The String marker for this list item
ordinal - The position of this list item in the list
validate - Whether to validate the value of the marker

Returns the String 0-index marker for this list item



# File 'lib/asciidoctor/parser.rb', line 2167

def self.resolve_list_marker(list_type, marker, ordinal = 0, validate = false, reader = nil)
  if list_type == :ulist
    marker
  elsif list_type == :olist
    resolve_ordered_list_marker(marker, ordinal, validate, reader)[0]
  else # :colist
    '<1>'
  end
end

.resolve_ordered_list_marker(marker, ordinal = 0, validate = false, reader = nil) ⇒ Object

Internal: Resolve the 0-index marker for this ordered list item

Match the marker used for this ordered list item against the known ordered list markers and determine which marker is the first (0-index) marker in its number series.

The purpose of this method is to normalize the implicit numbered markers so that they can be compared against other list items.

marker - The marker used for this list item
ordinal - The 0-based index of the list item (default: 0)
validate - Perform validation that the marker provided is the proper marker in the sequence (default: false)

Examples

marker = 'B.'
Parser.resolve_ordered_list_marker(marker, 1, true, reader)
# => ['A.', :upperalpha]

marker = '.'
Parser.resolve_ordered_list_marker(marker, 1, true, reader)
# => ['.']

Returns a tuple that contains the String of the first marker in this number series and the implicit list style, if applicable



# File 'lib/asciidoctor/parser.rb', line 2203

def self.resolve_ordered_list_marker(marker, ordinal = 0, validate = false, reader = nil)
  return [marker] if marker.start_with? '.'
  # NOTE case statement is guaranteed to match one of the conditions
  case (style = ORDERED_LIST_STYLES.find {|s| OrderedListMarkerRxMap[s].match? marker })
  when :arabic
    if validate
      expected = ordinal + 1
      actual = marker.to_i # remove trailing . and coerce to int
    end
    marker = '1.'
  when :loweralpha
    if validate
      expected = ('a'[0].ord + ordinal).chr
      actual = marker.chop # remove trailing .
    end
    marker = 'a.'
  when :upperalpha
    if validate
      expected = ('A'[0].ord + ordinal).chr
      actual = marker.chop # remove trailing .
    end
    marker = 'A.'
  when :lowerroman
    if validate
      expected = Helpers.int_to_roman(ordinal + 1).downcase
      actual = marker.chop # remove trailing )
    end
    marker = 'i)'
  when :upperroman
    if validate
      expected = Helpers.int_to_roman(ordinal + 1)
      actual = marker.chop # remove trailing )
    end
    marker = 'I)'
  end

  if validate && expected != actual
    logger.warn message_with_context %(list item index: expected #{expected}, got #{actual}), source_location: reader.cursor
  end

  [marker, style]
end
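
When validate is true, the method compares the marker against the value expected at the given ordinal. The sketch below computes that expected value on its own. Both expected_marker_value and the int_to_roman stand-in are hypothetical helpers written for this illustration; the method above delegates Roman numeral conversion to Helpers.int_to_roman.

```ruby
# Value-to-letters pairs, largest first (used by the stand-in converter below).
ROMAN_NUMERALS = {
  1000 => 'M', 900 => 'CM', 500 => 'D', 400 => 'CD', 100 => 'C', 90 => 'XC',
  50 => 'L', 40 => 'XL', 10 => 'X', 9 => 'IX', 5 => 'V', 4 => 'IV', 1 => 'I'
}

# Hypothetical stand-in for Helpers.int_to_roman.
def int_to_roman(val)
  result = +''
  ROMAN_NUMERALS.each do |amount, letters|
    repeat, val = val.divmod amount
    result << (letters * repeat)
  end
  result
end

# Computes the marker value expected at a 0-based ordinal (hypothetical helper).
#
# style   - the Symbol list style (:arabic, :loweralpha, :upperalpha, :lowerroman or :upperroman)
# ordinal - the Integer 0-based position of the item in the list
#
# Returns the expected value, without the trailing . or ) of the marker.
def expected_marker_value(style, ordinal)
  case style
  when :arabic then ordinal + 1
  when :loweralpha then ('a'.ord + ordinal).chr
  when :upperalpha then ('A'.ord + ordinal).chr
  when :lowerroman then int_to_roman(ordinal + 1).downcase
  when :upperroman then int_to_roman(ordinal + 1)
  end
end

p expected_marker_value(:upperalpha, 1)
p expected_marker_value(:lowerroman, 3)
```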

.sanitize_attribute_name(name) ⇒ Object

Internal: Convert a string to a legal attribute name.

name - the String name of the attribute

Returns a String with the legal AsciiDoc attribute name.

Examples

sanitize_attribute_name('Foo Bar')
=> 'foobar'

sanitize_attribute_name('foo')
=> 'foo'

sanitize_attribute_name('Foo 3 #-Billy')
=> 'foo3-billy'


# File 'lib/asciidoctor/parser.rb', line 2753

def self.sanitize_attribute_name(name)
  name.gsub(InvalidAttributeNameCharsRx, '').downcase
end
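
The examples above can be reproduced outside the library with a stand-in pattern. The regular expression below is an assumption about what InvalidAttributeNameCharsRx matches (anything other than a word character or a hyphen), not a copy of it, and sanitize_name is a hypothetical helper.

```ruby
# Assumed stand-in for InvalidAttributeNameCharsRx: anything that is not a word character or a hyphen.
INVALID_NAME_CHARS = /[^\p{Word}-]/

# Removes the invalid characters and lowercases the result (hypothetical helper).
def sanitize_name(name)
  name.gsub(INVALID_NAME_CHARS, '').downcase
end

p sanitize_name('Foo Bar')
p sanitize_name('Foo 3 #-Billy')
```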

.setext_section_title?(line1, line2) ⇒ Boolean

Checks whether the lines given are a setext section title.

line1 - [String] candidate title
line2 - [String] candidate underline

Returns the [Integer] section level if these lines are a setext section title, otherwise nothing.

Returns:

  • (Boolean)


# File 'lib/asciidoctor/parser.rb', line 1689

def self.setext_section_title? line1, line2
  if (level = SETEXT_SECTION_LEVELS[line2_ch0 = line2.chr]) && (uniform? line2, line2_ch0, (line2_len = line2.length)) &&
      (SetextSectionTitleRx.match? line1) && (line1.length - line2_len).abs < 2
    level
  end
end
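
The checks combined in the condition above can be sketched as a standalone method. Both the level mapping and the title pattern below are assumptions standing in for SETEXT_SECTION_LEVELS and SetextSectionTitleRx, and setext_level is a hypothetical helper.

```ruby
# Assumed mapping of underline character to section level.
SETEXT_LEVELS = { '=' => 0, '-' => 1, '~' => 2, '^' => 3, '+' => 4 }

# Determines the section level of a two-line setext title (hypothetical helper).
#
# title     - the String candidate title line
# underline - the String candidate underline
#
# Returns the Integer section level, or nil if the lines are not a setext title.
def setext_level(title, underline)
  ch = underline.chr
  level = SETEXT_LEVELS[ch]
  # the underline must consist of a single, recognized character
  return unless level && underline.count(ch) == underline.length
  # assumed stand-in for SetextSectionTitleRx: no leading dot, at least one alphanumeric character
  return unless title.match?(/\A(?!\.).*[[:alnum:]]/)
  # the underline length must be within one character of the title length
  level if (title.length - underline.length).abs < 2
end

p setext_level('Section', '-------')
p setext_level('Section', '---')
```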

.store_attribute(name, value, doc = nil, attrs = nil) ⇒ Object

Public: Store the attribute in the document and register attribute entry if accessible

name - the String name of the attribute to store; if name begins or ends with !, it signals to remove the attribute with that root name
value - the String value of the attribute to store
doc - the Document being parsed
attrs - the attributes for the current context

Returns a 2-element array containing the resolved attribute name (minus the ! indicator) and value



# File 'lib/asciidoctor/parser.rb', line 2108

def self.store_attribute name, value, doc = nil, attrs = nil
  # TODO move processing of attribute value to utility method
  if name.end_with? '!'
    # a nil value signals the attribute should be deleted (unset)
    name = name.chop
    value = nil
  elsif name.start_with? '!'
    # a nil value signals the attribute should be deleted (unset)
    name = (name.slice 1, name.length)
    value = nil
  end

  if (name = sanitize_attribute_name name) == 'numbered'
    name = 'sectnums'
  elsif name == 'hardbreaks'
    name = 'hardbreaks-option'
  end

  if doc
    if value
      if name == 'leveloffset'
        # support relative leveloffset values
        if value.start_with? '+'
          value = ((doc.attr 'leveloffset', 0).to_i + (value.slice 1, value.length).to_i).to_s
        elsif value.start_with? '-'
          value = ((doc.attr 'leveloffset', 0).to_i - (value.slice 1, value.length).to_i).to_s
        end
      end
      # QUESTION should we set value to locked value if set_attribute returns false?
      if (resolved_value = doc.set_attribute name, value)
        value = resolved_value
        (Document::AttributeEntry.new name, value).save_to attrs if attrs
      end
    elsif (doc.delete_attribute name) && attrs
      (Document::AttributeEntry.new name, value).save_to attrs
    end
  elsif attrs
    (Document::AttributeEntry.new name, value).save_to attrs
  end

  [name, value]
end
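
The relative leveloffset arithmetic is the one place where the stored value differs from the value supplied. A minimal sketch of that step follows. The helper resolve_leveloffset is hypothetical and takes the current offset as an argument instead of reading it from a Document.

```ruby
# Resolves a leveloffset value against the current offset (hypothetical helper).
#
# current - the current leveloffset (Integer or String)
# value   - the String value from the attribute entry; a leading + or - makes it relative
#
# Returns the resolved leveloffset as a String.
def resolve_leveloffset(current, value)
  if value.start_with? '+'
    (current.to_i + value[1..-1].to_i).to_s
  elsif value.start_with? '-'
    (current.to_i - value[1..-1].to_i).to_s
  else
    # absolute values are stored as given
    value
  end
end

p resolve_leveloffset('1', '+1')
p resolve_leveloffset(0, '-1')
```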

.uniform?(str, chr, len) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/asciidoctor/parser.rb', line 2733

def self.uniform? str, chr, len
  (str.count chr) == len
end

.yield_buffered_attribute(attrs, name, value, reader) ⇒ Object

Internal: Save the collected attribute (:id, :option, :role, or nil for :style) in the attribute Hash.



# File 'lib/asciidoctor/parser.rb', line 2602

def self.yield_buffered_attribute attrs, name, value, reader
  if name
    if value.empty?
      if reader
        logger.warn message_with_context %(invalid empty #{name} detected in style attribute), source_location: reader.cursor_at_prev_line
      else
        logger.warn %(invalid empty #{name} detected in style attribute)
      end
    elsif name == :id
      if attrs.key? :id
        if reader
          logger.warn message_with_context 'multiple ids detected in style attribute', source_location: reader.cursor_at_prev_line
        else
          logger.warn 'multiple ids detected in style attribute'
        end
      end
      attrs[name] = value
    else
      (attrs[name] ||= []) << value
    end
  else
    attrs[:style] = value unless value.empty?
  end
  nil
end
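
The effect of successive calls is easier to see with a plain Hash. The sketch below mirrors the branching above but replaces the logger and reader with Kernel#warn, so it runs without Asciidoctor. The helper save_buffered_attribute and the sample values are hypothetical, chosen to resemble the parts of a shorthand style such as quote#intro.lead.wide%unbreakable.

```ruby
# Saves one collected attribute into the Hash (hypothetical helper that mirrors the method above).
#
# attrs - the Hash that accumulates the parsed attributes
# name  - the Symbol attribute name (:id, :role or :option), or nil for the style
# value - the String value that was collected
#
# Returns nil.
def save_buffered_attribute(attrs, name, value)
  if name
    if value.empty?
      warn %(invalid empty #{name} detected in style attribute)
    elsif name == :id
      warn 'multiple ids detected in style attribute' if attrs.key? :id
      # the last id wins
      attrs[:id] = value
    else
      # roles and options accumulate in an Array
      (attrs[name] ||= []) << value
    end
  else
    attrs[:style] = value unless value.empty?
  end
  nil
end

attrs = {}
save_buffered_attribute attrs, nil, 'quote'
save_buffered_attribute attrs, :id, 'intro'
save_buffered_attribute attrs, :role, 'lead'
save_buffered_attribute attrs, :role, 'wide'
save_buffered_attribute attrs, :option, 'unbreakable'
p attrs
```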