Class: Asciidoctor::Parser

Inherits:
Object
Includes:
Logging
Defined in:
lib/asciidoctor/parser.rb

Overview

> Asciidoctor::Block

Defined Under Namespace

Classes: BlockMatchData

Constant Summary

TAB =

String for matching tab character

?\t
TabIndentRx =

Regexp for leading tab indentation

/^\t+/
StartOfBlockProc =
proc {|l| ((l.start_with? '[') && (BlockAttributeLineRx.match? l)) || (is_delimited_block? l) }
StartOfListProc =
proc {|l| AnyListRx.match? l }
StartOfBlockOrListProc =
proc {|l| (is_delimited_block? l) || ((l.start_with? '[') && (BlockAttributeLineRx.match? l)) || (AnyListRx.match? l) }
NoOp =
nil
AuthorKeys =
['author', 'authorinitials', 'firstname', 'middlename', 'lastname', 'email']
ListContinuationMarker =
::Module.new
ListContinuationPlaceholder =
::String.new.extend ListContinuationMarker
ListContinuationString =
(::String.new LIST_CONTINUATION).extend ListContinuationMarker
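
The last three constants rely on a general Ruby idiom: a single String instance is extended with an empty module so that it can be recognized by type while still behaving like an ordinary string. The following is a minimal sketch of that idiom only. The names ContinuationTag, placeholder and plain are hypothetical, and how the parser consumes its own markers is not shown in this section.

```ruby
# Hypothetical names for illustration; not the parser's constants.
ContinuationTag = Module.new

# Extending one instance tags that object only, not the String class.
placeholder = String.new.extend ContinuationTag
plain = String.new

p placeholder == plain            # content comparison is unaffected
p ContinuationTag === placeholder # the tagged instance matches the module
p ContinuationTag === plain       # an ordinary String does not
```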

Class Method Summary

Methods included from Logging

#logger, #message_with_context

Class Method Details

.adjust_indentation!(lines, indent_size = 0, tab_size = 0) ⇒ Object

Remove the block indentation (the amount of whitespace of the least indented line), replace tabs with spaces (using proper tab expansion logic) and, finally, indent the lines by the margin width. Modifies the input Array directly.

This method preserves the significant indentation (that exceeding the block indent) on each line.
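
The "proper tab expansion logic" mentioned above means that a tab advances to the next tab stop rather than always becoming a fixed number of spaces. The sketch below illustrates that idea in isolation. The helper expand_tabs is hypothetical and, unlike the method documented here, returns a new String instead of modifying an Array in place.

```ruby
# Hypothetical helper, not part of Asciidoctor: expands each tab to the next tab stop.
def expand_tabs line, tab_size
  return line if tab_size <= 0 || !(line.include? "\t")
  result = String.new
  line.each_char do |c|
    if c == "\t"
      # pad with as many spaces as needed to reach the next multiple of tab_size
      result << ' ' * (tab_size - result.length % tab_size)
    else
      result << c
    end
  end
  result
end

p expand_tabs("a\tb", 4)   # => "a   b"
p expand_tabs("\t\tx", 2)  # => "    x"
```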

Examples:

source = <<EOS
    def names
      @names.split
    end
EOS
lines = source.split ?\n
# => ["    def names", "      @names.split", "    end"]
Parser.adjust_indentation! lines
puts lines.join ?\n
# => def names
# =>   @names.split
# => end

Returns nothing.

Parameters:

  • lines

    The Array of String lines to process (no trailing newlines)

  • indent_size (defaults to: 0)

    The Integer number of spaces to re-add to the start of non-empty lines after removing the indentation. If this value is < 0, the existing indentation is preserved (optional, default: 0)

  • tab_size (defaults to: 0)

    The Integer number of spaces to use in place of a tab. A value of <= 0 disables the replacement (optional, default: 0)



# File 'lib/asciidoctor/parser.rb', line 2673

def self.adjust_indentation! lines, indent_size = 0, tab_size = 0
  return if lines.empty?

  # expand tabs if a tab character is detected and tab_size > 0
  if tab_size > 0 && lines.any? {|line| line.include? TAB }
    full_tab_space = ' ' * tab_size
    lines.map! do |line|
      if line.empty? || (tab_idx = line.index TAB).nil?
        line
      else
        if tab_idx == 0
          leading_tabs = 0
          line.each_byte do |b|
            break unless b == 9
            leading_tabs += 1
          end
          line = %(#{full_tab_space * leading_tabs}#{line.slice leading_tabs, line.length})
          next line unless line.include? TAB
        end
        # keeps track of how many spaces were added to adjust offset in match data
        spaces_added = 0
        idx = 0
        result = ''
        line.each_char do |c|
          if c == TAB
            # calculate how many spaces this tab represents, then replace tab with spaces
            if (offset = idx + spaces_added) % tab_size == 0
              spaces_added += tab_size - 1
              result += full_tab_space
            else
              unless (spaces = tab_size - offset % tab_size) == 1
                spaces_added += spaces - 1
              end
              result += ' ' * spaces
            end
          else
            result += c
          end
          idx += 1
        end
        result
      end
    end
  end

  # skip block indent adjustment if indent_size is < 0
  return if indent_size < 0

  # determine block indent (assumes no whitespace-only lines are present)
  block_indent = nil
  lines.each do |line|
    next if line.empty?
    if (line_indent = line.length - line.lstrip.length) == 0
      block_indent = nil
      break
    end
    block_indent = line_indent unless block_indent && block_indent < line_indent
  end

  # remove block indent then apply indent_size if specified
  # NOTE block_indent is > 0 if not nil
  if indent_size == 0
    lines.map! {|line| line.empty? ? line : (line.slice block_indent, line.length) } if block_indent
  else
    new_block_indent = ' ' * indent_size
    if block_indent
      lines.map! {|line| line.empty? ? line : new_block_indent + (line.slice block_indent, line.length) }
    else
      lines.map! {|line| line.empty? ? line : new_block_indent + line }
    end
  end

  nil
end
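
The block indent step in the method above can be sketched on its own as follows. This is a simplified illustration, not the library's code: block_indent and strip_block_indent are hypothetical helpers, and strip_block_indent returns a new Array where the real method mutates its argument.

```ruby
# Hypothetical helper: width of the least indented non-empty line,
# or nil as soon as a non-empty line has no indentation at all.
def block_indent lines
  indent = nil
  lines.each do |line|
    next if line.empty?
    width = line.length - line.lstrip.length
    return nil if width == 0
    indent = width if indent.nil? || width < indent
  end
  indent
end

# Hypothetical helper: removes the block indent but keeps any indentation beyond it.
def strip_block_indent lines
  indent = block_indent lines
  return lines unless indent
  lines.map {|line| line.empty? ? line : line[indent..-1] }
end

puts strip_block_indent(['    def names', '      @names.split', '    end'])
```

Empty lines are skipped when measuring, so a blank line inside the block does not force the indent to zero.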

.atx_section_title?(line) ⇒ Integer

Checks whether the line given is an atx section title.

The level returned is 1 less than the number of leading markers.

Parameters:

  • line (String)

    candidate title with leading atx marker.

Returns:

  • (Integer)

    Returns the Integer section level if this line is an atx section title, otherwise nothing.
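
To make the "1 less than the number of leading markers" rule concrete, here is a minimal sketch. It assumes a simplified pattern, SIMPLE_ATX_RX, in place of the parser's own regular expressions, and it ignores the Markdown-style '#' marker. The helper name atx_level is hypothetical.

```ruby
# Simplified stand-in for the parser's title pattern (an assumption for this sketch):
# one to six '=' markers, whitespace, then the start of the title text.
SIMPLE_ATX_RX = /\A(={1,6})[ \t]+\S/

def atx_level line
  # cheap prefix check first, then the regular expression
  return unless (line.start_with? '=') && SIMPLE_ATX_RX =~ line
  $1.length - 1
end

p atx_level('= Document Title') # => 0
p atx_level('=== Subsection')   # => 2
p atx_level('Not a title')      # => nil
```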



# File 'lib/asciidoctor/parser.rb', line 1708

def self.atx_section_title? line
  if Compliance.markdown_syntax ? ((line.start_with? '=', '#') && ExtAtxSectionTitleRx =~ line) :
      ((line.start_with? '=') && AtxSectionTitleRx =~ line)
    $1.length - 1
  end
end

.build_block(block_context, content_model, terminator, parent, reader, attributes, options = {}) ⇒ Object

Whether a block supports compound content should be a config setting. If terminator is false, all the lines in the reader should be parsed. NOTE: a filter could be invoked here, before and after parsing.



# File 'lib/asciidoctor/parser.rb', line 1016

def self.build_block(block_context, content_model, terminator, parent, reader, attributes, options = {})
  case content_model
  when :skip
    skip_processing, parse_as_content_model = true, :simple
  when :raw
    skip_processing, parse_as_content_model = false, :simple
  else
    skip_processing, parse_as_content_model = false, content_model
  end

  if terminator.nil?
    if parse_as_content_model == :verbatim
      lines = reader.read_lines_until break_on_blank_lines: true, break_on_list_continuation: true
    else
      content_model = :simple if content_model == :compound
      # TODO we could also skip processing if we're able to detect reader is a BlockReader
      lines = read_paragraph_lines reader, false, skip_line_comments: true, skip_processing: skip_processing
      # QUESTION check for empty lines after grabbing lines for simple content model?
    end
    block_reader = nil
  elsif parse_as_content_model != :compound
    lines = reader.read_lines_until terminator: terminator, skip_processing: skip_processing, context: block_context, cursor: :at_mark
    block_reader = nil
  # terminator is false when reader has already been prepared
  elsif terminator == false
    lines = nil
    block_reader = reader
  else
    lines = nil
    block_cursor = reader.cursor
    block_reader = Reader.new reader.read_lines_until(terminator: terminator, skip_processing: skip_processing, context: block_context, cursor: :at_mark), block_cursor
  end

  case content_model
  when :verbatim
    tab_size = (attributes['tabsize'] || parent.document.attributes['tabsize']).to_i
    if (indent = attributes['indent'])
      adjust_indentation! lines, indent.to_i, tab_size
    elsif tab_size > 0
      adjust_indentation! lines, -1, tab_size
    end
  when :skip
    # QUESTION should we still invoke process method if extension is specified?
    return
  end

  if (extension = options[:extension])
    # QUESTION do we want to delete the style?
    attributes.delete('style')
    if (block = extension.process_method[parent, block_reader || (Reader.new lines), attributes.merge]) && block != parent
      attributes.replace block.attributes
      # NOTE an extension can change the content model from :simple to :compound. It's up to the extension
      # to decide which one to use. The extension can consult the cloaked-context attribute to determine
      # if the input is a paragraph or delimited block.
      if block.content_model == :compound && Block === block && !(lines = block.lines).empty?
        content_model = :compound
        block_reader = Reader.new lines
      end
    else
      return
    end
  else
    block = Block.new(parent, block_context, content_model: content_model, source: lines, attributes: attributes)
  end

  # reader is confined within boundaries of a delimited block, so look for
  # blocks until there are no more lines
  parse_blocks block_reader, block if content_model == :compound

  block
end

.is_delimited_block?(line, return_match_data = nil) ⇒ Boolean

Determines whether this line is the start of a known delimited block.

Returns:

  • (Boolean)

    the BlockMatchData (if return_match_data is true) or true (if return_match_data is false) if this line is the start of a delimited block, otherwise nothing.
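
The core idea is that a delimiter line is either the two-character open block marker or a known four-character tip followed by zero or more repetitions of the tip's last character. A minimal sketch under that assumption follows. The TIPS table is an illustrative subset, delimited_context is a hypothetical helper, and fenced code blocks are deliberately left out.

```ruby
# Illustrative subset of delimiter tips; an assumption, not the parser's lookup tables.
TIPS = {
  '--' => :open,
  '----' => :listing,
  '....' => :literal,
  '====' => :example,
  '****' => :sidebar,
  '____' => :quote,
  '|===' => :table,
}.freeze

# Hypothetical helper: returns the block context Symbol, or nil if line is not a delimiter.
def delimited_context line
  return TIPS[line] if line.length == 2
  return if line.length < 4
  tip = line[0, 4]
  context = TIPS[tip]
  return unless context
  tail_char = tip[-1]
  # any characters after the tip must repeat the tip's last character
  (line[4..-1].each_char.all? {|c| c == tail_char }) ? context : nil
end

p delimited_context('----')   # => :listing
p delimited_context('------') # => :listing
p delimited_context('--')     # => :open
p delimited_context('----x')  # => nil
```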



# File 'lib/asciidoctor/parser.rb', line 976

def self.is_delimited_block? line, return_match_data = nil
  # highly optimized for best performance
  return unless (line_len = line.length) > 1 && DELIMITED_BLOCK_HEADS[line.slice 0, 2]
  # open block
  if line_len == 2
    tip = line
    tip_len = 2
  else
    # all other delimited blocks, including fenced code
    if line_len < 5
      tip = line
      tip_len = line_len
    else
      tip = line.slice 0, (tip_len = 4)
    end
    # special case for fenced code blocks
    if Compliance.markdown_syntax && (tip.start_with? '`')
      if tip_len == 4
        if tip == '````' || (tip = tip.chop) != '```'
          return
        end
        line = tip
        line_len = tip_len = 3
      elsif tip != '```'
        return
      end
    elsif tip_len == 3
      return
    end
  end
  # NOTE line matches the tip when delimiter is minimum length or fenced code
  context, masq = DELIMITED_BLOCKS[tip]
  if context && (line_len == tip_len || (uniform? (line.slice 1, line_len), DELIMITED_BLOCK_TAILS[tip], (line_len - 1)))
    return_match_data ? (BlockMatchData.new context, masq, tip, line) : true
  end
end

.is_section_title?(line1, line2 = nil) ⇒ Integer

Checks whether the lines given are an atx or setext section title.

Parameters:

  • line1 (String)

    candidate title.

  • line2 (String) (defaults to: nil)

    candidate underline (default: nil).

Returns:

  • (Integer)

    Returns the Integer section level if these lines are a section title, otherwise nothing.
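
The two-step check (atx first, then setext only when a second line is present) might be sketched like this. Everything here is an assumption made for illustration: the pattern TOY_ATX_RX, the underline-to-level table TOY_SETEXT_LEVELS, the rule that the underline length must be within one character of the title length, and the helper name section_level. The method setext_section_title? itself is not documented in this section.

```ruby
# Simplified, hypothetical stand-ins for the parser's patterns and tables.
TOY_ATX_RX = /\A(={1,6})[ \t]+\S/
TOY_SETEXT_LEVELS = { '=' => 0, '-' => 1, '~' => 2, '^' => 3, '+' => 4 }.freeze

def section_level line1, line2 = nil
  # single-line (atx) titles win
  return $1.length - 1 if TOY_ATX_RX =~ line1
  # a two-line (setext) title needs a non-empty underline
  return if line2.nil? || line2.empty?
  marker = line2[0]
  level = TOY_SETEXT_LEVELS[marker]
  return unless level && (line2.each_char.all? {|c| c == marker })
  # assumed rule: underline is about as long as the title
  (line1.length - line2.length).abs < 2 ? level : nil
end

p section_level('== Title')       # => 1
p section_level('Title', '-----') # => 1
p section_level('Title', '--')    # => nil
```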



# File 'lib/asciidoctor/parser.rb', line 1697

def self.is_section_title?(line1, line2 = nil)
  atx_section_title?(line1) || (line2.nil_or_empty? ? nil : setext_section_title?(line1, line2))
end

.next_block(reader, parent, attributes = {}, options = {}) ⇒ Object

Parse and return the next Block at the Reader’s current location.

This method begins by skipping over blank lines to find the start of the next block (paragraph, block macro, or delimited block). If a block is found, that block is parsed, initialized as a Block object, and returned. Otherwise, the method returns nothing.

Regular expressions from the Asciidoctor module are used to match block boundaries. The ensuing lines are then processed according to the content model.

Parameters:

  • reader

    The Reader from which to retrieve the next Block.

  • parent

    The Document, Section or Block to which the next Block belongs.

  • attributes (defaults to: {})

    A Hash of attributes that will become the attributes associated with the parsed Block (default: {}).

  • options (defaults to: {})

    An options Hash to control parsing (default: {}):

      • :text_only indicates that the parser is only looking for text content
      • :list_type indicates this block will be attached to a list item in a list of the specified type

Returns:

  • a Block object built from the parsed content of the processed lines, or nothing if no block is found.
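
The first step described above (skip blank lines, then collect the lines of the next block, or return nothing at the end of input) can be illustrated with a toy model. This sketch only handles plain paragraphs and uses a mutable Array of lines in place of a Reader. The helper next_paragraph is hypothetical and does not reflect the Reader API.

```ruby
# Toy model: returns the next paragraph as an Array of lines and consumes it
# from the input, or returns nil if only blank lines remain.
def next_paragraph lines
  # skip ahead to the block content
  lines.shift while !lines.empty? && lines[0].strip.empty?
  return if lines.empty?
  para = []
  # a paragraph is a run of contiguous non-blank lines
  para << lines.shift until lines.empty? || lines[0].strip.empty?
  para
end

input = ['', 'first line', 'second line', '', 'another paragraph', '']
p next_paragraph(input) # => ["first line", "second line"]
p next_paragraph(input) # => ["another paragraph"]
p next_paragraph(input) # => nil
```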



# File 'lib/asciidoctor/parser.rb', line 503

def self.next_block(reader, parent, attributes = {}, options = {})
  # skip ahead to the block content; bail if we've reached the end of the reader
  return unless (skipped = reader.skip_blank_lines)

  # check for option to find list item text only
  # if skipped a line, assume a list continuation was
  # used and block content is acceptable
  if (text_only = options[:text_only]) && skipped > 0
    options.delete :text_only
    text_only = nil
  end

  document = parent.document

  if options.fetch :parse_metadata, true
    # read lines until there are no more metadata lines to read; note that :text_only option impacts parsing rules
    while parse_block_metadata_line reader, document, attributes, options
      # discard the line just processed
      reader.shift
      # QUESTION should we clear the attributes? no known cases when it's necessary
      reader.skip_blank_lines || return
    end
  end

  if (extensions = document.extensions)
    block_extensions, block_macro_extensions = extensions.blocks?, extensions.block_macros?
  end

  # QUESTION should we introduce a parsing context object?
  reader.mark
  this_line, doc_attrs, style = reader.read_line, document.attributes, attributes[1]
  block = block_context = cloaked_context = terminator = nil

  if (delimited_block = is_delimited_block? this_line, true)
    block_context = cloaked_context = delimited_block.context
    terminator = delimited_block.terminator
    if style
      unless style == block_context.to_s
        if delimited_block.masq.include? style
          block_context = style.to_sym
        elsif delimited_block.masq.include?('admonition') && ADMONITION_STYLES.include?(style)
          block_context = :admonition
        elsif block_extensions && extensions.registered_for_block?(style, block_context)
          block_context = style.to_sym
        else
          logger.debug message_with_context %(unknown style for #{block_context} block: #{style}), source_location: reader.cursor_at_mark if logger.debug?
          style = block_context.to_s
        end
      end
    else
      style = attributes['style'] = block_context.to_s
    end
  end

  # this loop is used for flow control; it only executes once, and only when delimited_block is not set
  # break once a block is found or at end of loop
  # returns nil if the line should be dropped
  while true
    # process lines verbatim
    if style && Compliance.strict_verbatim_paragraphs && (VERBATIM_STYLES.include? style)
      block_context = style.to_sym
      cloaked_context = :paragraph
      reader.unshift_line this_line
      # advance to block parsing =>
      break
    end

    # process lines normally
    if text_only
      indented = this_line.start_with? ' ', TAB
    else
      # NOTE move this declaration up if we need it when text_only is false
      md_syntax = Compliance.markdown_syntax
      if this_line.start_with? ' '
        indented, ch0 = true, ' '
        # QUESTION should we test line length?
        if md_syntax && this_line.lstrip.start_with?(*MARKDOWN_THEMATIC_BREAK_CHARS.keys) &&
            #!(this_line.start_with? '    ') &&
            (MarkdownThematicBreakRx.match? this_line)
          # NOTE we're letting break lines (horizontal rule, page_break, etc) have attributes
          block = Block.new(parent, :thematic_break, content_model: :empty)
          break
        end
      elsif this_line.start_with? TAB
        indented, ch0 = true, TAB
      else
        indented, ch0 = false, this_line.chr
        layout_break_chars = md_syntax ? HYBRID_LAYOUT_BREAK_CHARS : LAYOUT_BREAK_CHARS
        if (layout_break_chars.key? ch0) &&
            (md_syntax ? (ExtLayoutBreakRx.match? this_line) : (uniform? this_line, ch0, (ll = this_line.length)) && ll > 2)
          # NOTE we're letting break lines (horizontal rule, page_break, etc) have attributes
          block = Block.new(parent, layout_break_chars[ch0], content_model: :empty)
          break
        # NOTE very rare that a text-only line will end in ] (e.g., inline macro), so check that first
        elsif (this_line.end_with? ']') && (this_line.include? '::')
          #if (this_line.start_with? 'image', 'video', 'audio') && BlockMediaMacroRx =~ this_line
          if (ch0 == 'i' || (this_line.start_with? 'video:', 'audio:')) && BlockMediaMacroRx =~ this_line
            blk_ctx, target, blk_attrs = $1.to_sym, $2, $3
            block = Block.new parent, blk_ctx, content_model: :empty
            if blk_attrs
              case blk_ctx
              when :video
                posattrs = ['poster', 'width', 'height']
              when :audio
                posattrs = []
              else # :image
                posattrs = ['alt', 'width', 'height']
              end
              block.parse_attributes blk_attrs, posattrs, sub_input: true, into: attributes
            end
            # style doesn't have special meaning for media macros
            attributes.delete 'style' if attributes.key? 'style'
            if target.include? ATTR_REF_HEAD
              if (expanded_target = block.sub_attributes target).empty? &&
                  (doc_attrs['attribute-missing'] || Compliance.attribute_missing) == 'drop-line' &&
                  (block.sub_attributes target + ' ', attribute_missing: 'drop-line', drop_line_severity: :ignore).empty?
                attributes.clear
                return
              else
                target = expanded_target
              end
            end
            if blk_ctx == :image
              document.register :images, target
              attributes['imagesdir'] = doc_attrs['imagesdir']
              # NOTE style is the value of the first positional attribute in the block attribute line
              attributes['alt'] ||= style || (attributes['default-alt'] = Helpers.basename(target, true).tr('_-', ' '))
              unless (scaledwidth = attributes.delete 'scaledwidth').nil_or_empty?
                # NOTE assume % units if not specified
                attributes['scaledwidth'] = (TrailingDigitsRx.match? scaledwidth) ? %(#{scaledwidth}%) : scaledwidth
              end
              if attributes['title']
                block.title = block_title = attributes.delete 'title'
                block.assign_caption (attributes.delete 'caption'), 'figure'
              end
            end
            attributes['target'] = target
            break

          elsif ch0 == 't' && (this_line.start_with? 'toc:') && BlockTocMacroRx =~ this_line
            block = Block.new parent, :toc, content_model: :empty
            block.parse_attributes $1, [], into: attributes if $1
            break

          elsif block_macro_extensions ? (CustomBlockMacroRx =~ this_line &&
              (extension = extensions.registered_for_block_macro? $1) || (report_unknown_block_macro = logger.debug?)) :
              (logger.debug? && (report_unknown_block_macro = CustomBlockMacroRx =~ this_line))
            if report_unknown_block_macro
              logger.debug message_with_context %(unknown name for block macro: #{$1}), source_location: reader.cursor_at_mark
            else
              content = $3
              if (target = $2).include? ATTR_REF_HEAD
                if (expanded_target = parent.sub_attributes target).empty? &&
                    (doc_attrs['attribute-missing'] || Compliance.attribute_missing) == 'drop-line' &&
                    (parent.sub_attributes target + ' ', attribute_missing: 'drop-line', drop_line_severity: :ignore).empty?
                  attributes.clear
                  return
                else
                  target = expanded_target
                end
              end
              if (ext_config = extension.config)[:content_model] == :attributes
                document.parse_attributes content, ext_config[:positional_attrs] || ext_config[:pos_attrs] || [], sub_input: true, into: attributes if content
              else
                attributes['text'] = content || ''
              end
              if (default_attrs = ext_config[:default_attrs])
                attributes.update(default_attrs) {|_, old_v| old_v }
              end
              if (block = extension.process_method[parent, target, attributes]) && block != parent
                attributes.replace block.attributes
                break
              else
                attributes.clear
                return
              end
            end
          end
        end
      end
    end

    # haven't found anything yet, continue
    if !indented && (ch0 ||= this_line.chr) == '<' && CalloutListRx =~ this_line
      reader.unshift_line this_line
      block = parse_callout_list(reader, $~, parent, document.callouts)
      attributes['style'] = 'arabic'
      break

    elsif UnorderedListRx.match? this_line
      reader.unshift_line this_line
      attributes['style'] = style = 'bibliography' if !style && Section === parent && parent.sectname == 'bibliography'
      block = parse_list(reader, :ulist, parent, style)
      break

    elsif OrderedListRx.match? this_line
      reader.unshift_line this_line
      block = parse_list(reader, :olist, parent, style)
      attributes['style'] = block.style if block.style
      break

    elsif ((this_line.include? '::') || (this_line.include? ';;')) && DescriptionListRx =~ this_line
      reader.unshift_line this_line
      block = parse_description_list(reader, $~, parent)
      break

    elsif (style == 'float' || style == 'discrete') && (Compliance.underline_style_section_titles ?
        (is_section_title? this_line, reader.peek_line) : !indented && (atx_section_title? this_line))
      reader.unshift_line this_line
      float_id, float_reftext, block_title, float_level = parse_section_title reader, document, attributes['id']
      attributes['reftext'] = float_reftext if float_reftext
      block = Block.new(parent, :floating_title, content_model: :empty)
      block.title = block_title
      attributes.delete 'title'
      block.id = float_id || ((doc_attrs.key? 'sectids') ? (Section.generate_id block.title, document) : nil)
      block.level = float_level
      break

    # FIXME create another set for "passthrough" styles
    # FIXME make this more DRY!
    elsif style && style != 'normal'
      if PARAGRAPH_STYLES.include?(style)
        block_context = style.to_sym
        cloaked_context = :paragraph
        reader.unshift_line this_line
        # advance to block parsing =>
        break
      elsif ADMONITION_STYLES.include?(style)
        block_context = :admonition
        cloaked_context = :paragraph
        reader.unshift_line this_line
        # advance to block parsing =>
        break
      elsif block_extensions && extensions.registered_for_block?(style, :paragraph)
        block_context = style.to_sym
        cloaked_context = :paragraph
        reader.unshift_line this_line
        # advance to block parsing =>
        break
      else
        logger.debug message_with_context %(unknown style for paragraph: #{style}), source_location: reader.cursor_at_mark if logger.debug?
        style = nil
        # continue to process paragraph
      end
    end

    reader.unshift_line this_line

    # a literal paragraph: contiguous lines starting with at least one whitespace character
    # NOTE style can only be nil or "normal" at this point
    if indented && !style
      lines = read_paragraph_lines reader, (content_adjacent = skipped == 0 ? options[:list_type] : nil), skip_line_comments: text_only
      adjust_indentation! lines
      if text_only || content_adjacent == :dlist
        # this block gets folded into the list item text
        block = Block.new(parent, :paragraph, content_model: :simple, source: lines, attributes: attributes)
      else
        block = Block.new(parent, :literal, content_model: :verbatim, source: lines, attributes: attributes)
      end
    # a normal paragraph: contiguous non-blank/non-continuation lines (left-indented or normal style)
    else
      lines = read_paragraph_lines reader, skipped == 0 && options[:list_type], skip_line_comments: true
      # NOTE don't check indented here since it's extremely rare
      #if text_only || indented
      if text_only
        # if [normal] is used over an indented paragraph, shift content to left margin
        # QUESTION do we even need to shift since whitespace is normalized by XML in this case?
        adjust_indentation! lines if indented && style == 'normal'
        block = Block.new(parent, :paragraph, content_model: :simple, source: lines, attributes: attributes)
      elsif (ADMONITION_STYLE_HEADS.include? ch0) && (this_line.include? ':') && (AdmonitionParagraphRx =~ this_line)
        lines[0] = $' # string after match
        attributes['name'] = admonition_name = (attributes['style'] = $1).downcase
        attributes['textlabel'] = (attributes.delete 'caption') || doc_attrs[%(#{admonition_name}-caption)]
        block = Block.new(parent, :admonition, content_model: :simple, source: lines, attributes: attributes)
      elsif md_syntax && ch0 == '>' && this_line.start_with?('> ')
        lines.map! {|line| line == '>' ? (line.slice 1, line.length) : ((line.start_with? '> ') ? (line.slice 2, line.length) : line) }
        if lines[-1].start_with? '-- '
          credit_line = (credit_line = lines.pop).slice 3, credit_line.length
          unless lines.empty?
            lines.pop while lines[-1].empty?
          end
        end
        attributes['style'] = 'quote'
        # NOTE will only detect discrete (aka free-floating) headings
        # TODO could assume a discrete heading when inside a block context
        # FIXME Reader needs to be created w/ line info
        block = build_block(:quote, :compound, false, parent, Reader.new(lines), attributes)
        if credit_line
          attribution, citetitle = (block.apply_subs credit_line).split ', ', 2
          attributes['attribution'] = attribution if attribution
          attributes['citetitle'] = citetitle if citetitle
        end
      elsif ch0 == '"' && lines.size > 1 && (lines[-1].start_with? '-- ') && (lines[-2].end_with? '"')
        lines[0] = this_line.slice 1, this_line.length # strip leading quote
        credit_line = (credit_line = lines.pop).slice 3, credit_line.length
        lines.pop while lines[-1].empty?
        lines << lines.pop.chop # strip trailing quote
        attributes['style'] = 'quote'
        block = Block.new(parent, :quote, content_model: :simple, source: lines, attributes: attributes)
        attribution, citetitle = (block.apply_subs credit_line).split ', ', 2
        attributes['attribution'] = attribution if attribution
        attributes['citetitle'] = citetitle if citetitle
      else
        # if [normal] is used over an indented paragraph, shift content to left margin
        # QUESTION do we even need to shift since whitespace is normalized by XML in this case?
        adjust_indentation! lines if indented && style == 'normal'
        block = Block.new(parent, :paragraph, content_model: :simple, source: lines, attributes: attributes)
      end

      catalog_inline_anchors((lines.join LF), block, document, reader)
    end

    break # forbid loop from executing more than once
  end unless delimited_block

  # either delimited block or styled paragraph
  unless block
    case block_context
    when :listing, :source
      if block_context == :source || (language = attributes[1] ? nil : attributes[2] || doc_attrs['source-language'])
        if language # :listing
          attributes['style'] = 'source'
          attributes['language'] = language
          AttributeList.rekey attributes, [nil, nil, 'linenums']
        else # :source
          AttributeList.rekey attributes, [nil, 'language', 'linenums']
          if doc_attrs.key? 'source-language'
            attributes['language'] = doc_attrs['source-language']
          end unless attributes.key? 'language'
          attributes['cloaked-context'] = cloaked_context unless cloaked_context == :listing
        end
        if attributes['linenums-option'] || doc_attrs['source-linenums-option']
          attributes['linenums'] = ''
        end unless attributes.key? 'linenums'
        if doc_attrs.key? 'source-indent'
          attributes['indent'] = doc_attrs['source-indent']
        end unless attributes.key? 'indent'
      end
      block = build_block(:listing, :verbatim, terminator, parent, reader, attributes)
    when :fenced_code
      attributes['style'] = 'source'
      if (ll = this_line.length) > 3
        if (comma_idx = (language = this_line.slice 3, ll).index ',')
          if comma_idx > 0
            language = (language.slice 0, comma_idx).strip
            attributes['linenums'] = '' if comma_idx < ll - 4
          elsif ll > 4
            attributes['linenums'] = ''
          end
        else
          language = language.lstrip
        end
      end
      if language.nil_or_empty?
        attributes['language'] = doc_attrs['source-language'] if doc_attrs.key? 'source-language'
      else
        attributes['language'] = language
      end
      attributes['cloaked-context'] = cloaked_context
      if attributes['linenums-option'] || doc_attrs['source-linenums-option']
        attributes['linenums'] = ''
      end unless attributes.key? 'linenums'
      if doc_attrs.key? 'source-indent'
        attributes['indent'] = doc_attrs['source-indent']
      end unless attributes.key? 'indent'
      terminator = terminator.slice 0, 3
      block = build_block(:listing, :verbatim, terminator, parent, reader, attributes)
    when :table
      block_cursor = reader.cursor
      block_reader = Reader.new reader.read_lines_until(terminator: terminator, skip_line_comments: true, context: :table, cursor: :at_mark), block_cursor
      # NOTE it's very rare that format is set when using a format hint char, so short-circuit
      unless terminator.start_with? '|', '!'
        # NOTE infer dsv once all other format hint chars are ruled out
        attributes['format'] ||= (terminator.start_with? ',') ? 'csv' : 'dsv'
      end
      block = parse_table(block_reader, parent, attributes)
    when :sidebar
      block = build_block(block_context, :compound, terminator, parent, reader, attributes)
    when :admonition
      attributes['name'] = admonition_name = style.downcase
      attributes['textlabel'] = (attributes.delete 'caption') || doc_attrs[%(#{admonition_name}-caption)]
      block = build_block(block_context, :compound, terminator, parent, reader, attributes)
    when :open, :abstract, :partintro
      block = build_block(:open, :compound, terminator, parent, reader, attributes)
    when :literal
      block = build_block(block_context, :verbatim, terminator, parent, reader, attributes)
    when :example
      attributes['caption'] = '' if attributes['collapsible-option']
      block = build_block(block_context, :compound, terminator, parent, reader, attributes)
    when :quote, :verse
      AttributeList.rekey(attributes, [nil, 'attribution', 'citetitle'])
      block = build_block(block_context, (block_context == :verse ? :verbatim : :compound), terminator, parent, reader, attributes)
    when :stem, :latexmath, :asciimath
      attributes['style'] = STEM_TYPE_ALIASES[attributes[2] || doc_attrs['stem']] if block_context == :stem
      block = build_block(:stem, :raw, terminator, parent, reader, attributes)
    when :pass
      block = build_block(block_context, :raw, terminator, parent, reader, attributes)
    when :comment
      build_block(block_context, :skip, terminator, parent, reader, attributes)
      attributes.clear
      return
    else
      if block_extensions && (extension = extensions.registered_for_block? block_context, cloaked_context)
        unless (content_model = (ext_config = extension.config)[:content_model]) == :skip
          unless (positional_attrs = ext_config[:positional_attrs] || ext_config[:pos_attrs]).nil_or_empty?
            AttributeList.rekey(attributes, [nil] + positional_attrs)
          end
          if (default_attrs = ext_config[:default_attrs])
            default_attrs.each {|k, v| attributes[k] ||= v }
          end
          # QUESTION should we clone the extension for each cloaked context and set in config?
          attributes['cloaked-context'] = cloaked_context
        end
        unless (block = build_block block_context, content_model, terminator, parent, reader, attributes, extension: extension)
          attributes.clear
          return
        end
      else
        # this should only happen if there's a misconfiguration
        raise %(Unsupported block type #{block_context} at #{reader.cursor})
      end
    end
  end

  # FIXME we've got to clean this up, it's horrible!
  block.source_location = reader.cursor_at_mark if document.sourcemap
  # FIXME title and caption should be assigned when block is constructed (though we need to handle all cases)
  if attributes['title']
    block.title = block_title = attributes.delete 'title'
    block.assign_caption attributes.delete 'caption' if CAPTION_ATTRIBUTE_NAMES[block.context]
  end
  # TODO eventually remove the style attribute from the attributes hash
  #block.style = attributes.delete 'style'
  block.style = attributes['style']
  if (block_id = block.id || (block.id = attributes['id']))
    # convert title to resolve attributes while in scope
    block.title if block_title ? (block_title.include? ATTR_REF_HEAD) : block.title?
    unless document.register :refs, [block_id, block]
      logger.warn message_with_context %(id assigned to block already in use: #{block_id}), source_location: reader.cursor_at_mark
    end
  end
  # FIXME remove the need for this update!
  block.update_attributes attributes unless attributes.empty?
  block.commit_subs

  #if doc_attrs.key? :pending_attribute_entries
  #  doc_attrs.delete(:pending_attribute_entries).each do |entry|
  #    entry.save_to block.attributes
  #  end
  #end

  if block.sub? :callouts
    # No need to sub callouts if none are found when cataloging
    block.remove_sub :callouts unless catalog_callouts block.source, document
  end

  block
end
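
The fenced-code branch in the listing above derives the language and the linenums flag from whatever follows the three opening backticks. A minimal standalone sketch of that step is shown here, assuming a hypothetical helper named `parse_fence_info` that is not part of Asciidoctor and that returns a plain Hash instead of updating block attributes:

```ruby
# Hypothetical helper (not part of Asciidoctor): takes the opening line of a
# fenced code block (three backticks plus an optional info string) and
# returns the language and whether line numbers were requested.
def parse_fence_info(this_line)
  language = nil
  linenums = false
  if (ll = this_line.length) > 3
    language = this_line.slice 3, ll
    if (comma_idx = language.index ',')
      if comma_idx > 0
        # enable line numbers only if some text follows the comma
        linenums = true if comma_idx < ll - 4
        language = (language.slice 0, comma_idx).strip
      elsif ll > 4
        # info string starts with a comma; language is left as-is, mirroring the listing
        linenums = true
      end
    else
      language = language.lstrip
    end
  end
  { language: language, linenums: linenums }
end
```

In this sketch a line with only the three backticks yields a nil language, which is the case where the listing falls back to the source-language document attribute.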

.next_section(reader, parent, attributes = {}) ⇒ Object

Return the next section from the Reader.

This method processes block metadata, content and subsections for this section and returns the Section object and any orphaned attributes.

If the parent is a Document and has a header (document title), then this method will put any non-section blocks at the start of document into a preamble Block. If there are no such blocks, the preamble is dropped.

Since we are reading line-by-line, there’s a chance that metadata that should be associated with the following block gets consumed. To deal with this case, the method returns a running Hash of “orphaned” attributes that get passed to the next Section or Block.

Examples:

source
# => "= Greetings\n\nThis is my doc.\n\n== Salutations\n\nIt is awesome."
reader = Reader.new source, nil, normalize: true
# create empty document to parent the section
# and hold attributes extracted from header
doc = Document.new
Parser.next_section(reader, doc)[0].title
# => "Greetings"
Parser.next_section(reader, doc)[0].title
# => "Salutations"
returns a two-element Array containing the Section and Hash of orphaned attributes

Parameters:

  • reader

    the source Reader

  • parent

    the parent Section or Document of this new section

  • attributes (defaults to: {})

    a Hash of metadata that was left orphaned from the previous Section.



# File 'lib/asciidoctor/parser.rb', line 321

def self.next_section reader, parent, attributes = {}
  preamble = intro = part = false

  # check if we are at the start of processing the document
  # NOTE we could drop a hint in the attributes to indicate
  # that we are at a section title (so we don't have to check)
  if parent.context == :document && parent.blocks.empty? && ((has_header = parent.header?) ||
      (attributes.delete 'invalid-header') || !(is_next_line_section? reader, attributes))
    book = (document = parent).doctype == 'book'
    if has_header || (book && attributes[1] != 'abstract')
      preamble = intro = Block.new parent, :preamble, content_model: :compound
      preamble.title = parent.attr 'preface-title' if book && (parent.attr? 'preface-title')
      parent.blocks << preamble
    end
    section = parent
    current_level = 0
    if parent.attributes.key? 'fragment'
      expected_next_level = -1
    # small tweak to allow subsequent level-0 sections for book doctype
    elsif book
      expected_next_level, expected_next_level_alt = 1, 0
    else
      expected_next_level = 1
    end
  else
    book = (document = parent.document).doctype == 'book'
    section = initialize_section reader, parent, attributes
    # clear attributes except for title attribute, which must be carried over to next content block
    attributes = (title = attributes['title']) ? { 'title' => title } : {}
    expected_next_level = (current_level = section.level) + 1
    if current_level == 0
      part = book
    elsif current_level == 1 && section.special
      # NOTE technically preface sections are only permitted in the book doctype
      unless (sectname = section.sectname) == 'appendix' || sectname == 'preface' || sectname == 'abstract'
        expected_next_level = nil
      end
    end
  end

  reader.skip_blank_lines

  # Parse lines belonging to this section and its subsections until we
  # reach the end of this section level
  #
  # 1. first look for metadata thingies (anchor, attribute list, block title line, etc)
  # 2. then look for a section, recurse if found
  # 3. then process blocks
  #
  # We have to parse all the metadata lines before continuing with the loop,
  # otherwise subsequent metadata lines get interpreted as block content
  while reader.has_more_lines?
    parse_block_metadata_lines reader, document, attributes
    if (next_level = is_next_line_section?(reader, attributes))
      if document.attr? 'leveloffset'
        next_level += (document.attr 'leveloffset').to_i
        next_level = 0 if next_level < 0
      end
      if next_level > current_level
        if expected_next_level
          unless next_level == expected_next_level || (expected_next_level_alt && next_level == expected_next_level_alt) || expected_next_level < 0
            expected_condition = expected_next_level_alt ? %(expected levels #{expected_next_level_alt} or #{expected_next_level}) : %(expected level #{expected_next_level})
            logger.warn message_with_context %(section title out of sequence: #{expected_condition}, got level #{next_level}), source_location: reader.cursor
          end
        else
          logger.error message_with_context %(#{sectname} sections do not support nested sections), source_location: reader.cursor
        end
        new_section, attributes = next_section reader, section, attributes
        section.assign_numeral new_section
        section.blocks << new_section
      elsif next_level == 0 && section == document
        logger.error message_with_context 'level 0 sections can only be used when doctype is book', source_location: reader.cursor unless book
        new_section, attributes = next_section reader, section, attributes
        section.assign_numeral new_section
        section.blocks << new_section
      else
        # close this section (and break out of the nesting) to begin a new one
        break
      end
    else
      # just take one block or else we run the risk of overrunning section boundaries
      block_cursor = reader.cursor
      if (new_block = next_block reader, intro || section, attributes, parse_metadata: false)
        # REVIEW this may be doing too much
        if part
          if !section.blocks?
            # if this not a [partintro] open block, enclose it in a [partintro] open block
            if new_block.style != 'partintro'
              # if this is already a normal open block, simply add the partintro style
              if new_block.style == 'open' && new_block.context == :open
                new_block.style = 'partintro'
              else
                new_block.parent = (intro = Block.new section, :open, content_model: :compound)
                intro.style = 'partintro'
                section.blocks << intro
              end
            # if this is a [partintro] paragraph, convert it to a [partintro] open block w/ single paragraph
            elsif new_block.content_model == :simple
              new_block.content_model = :compound
              new_block << (Block.new new_block, :paragraph, source: new_block.lines, subs: new_block.subs)
              new_block.lines.clear
              new_block.subs.clear
            end
          elsif section.blocks.size == 1
            first_block = section.blocks[0]
            # open the [partintro] open block for appending
            if !intro && first_block.content_model == :compound
              logger.error message_with_context 'illegal block content outside of partintro block', source_location: block_cursor
            # rebuild [partintro] paragraph as an open block
            elsif first_block.content_model != :compound
              new_block.parent = (intro = Block.new section, :open, content_model: :compound)
              if first_block.style == (intro.style = 'partintro')
                first_block.context = :paragraph
                first_block.style = nil
              end
              section.blocks.shift
              intro << first_block
              section.blocks << intro
            end
          end
        end

        (intro || section).blocks << new_block
        attributes.clear
      end
    end

    reader.skip_blank_lines || break
  end

  if part
    unless section.blocks? && section.blocks[-1].context == :section
      logger.error message_with_context 'invalid part, must have at least one section (e.g., chapter, appendix, etc.)', source_location: reader.cursor
    end
  # NOTE we could try to avoid creating a preamble in the first place, though
  # that would require reworking assumptions in next_section since the preamble
  # is treated like an untitled section
  elsif preamble # implies parent == document
    if preamble.blocks?
      if book || document.blocks[1] || !Compliance.unwrap_standalone_preamble
        preamble.source_location = preamble.blocks[0].source_location if document.sourcemap
      # unwrap standalone preamble (i.e., document has no sections) except for books, if permissible
      else
        document.blocks.shift
        while (child_block = preamble.blocks.shift)
          document << child_block
        end
      end
    # drop the preamble if it has no content
    else
      document.blocks.shift
    end
  end

  # The attributes returned here are orphaned attributes that fall at the end
  # of a section that need to get transferred to the next section
  # see "trailing block attributes transfer to the following section" in
  # test/attributes_test.rb for an example
  [section == parent ? nil : section, attributes.merge]
end
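
The level handling inside the loop above can be hard to follow in context. The sketch below isolates two pieces of it: applying leveloffset and deciding whether the next section title nests under the current section or closes it. The method names `apply_leveloffset` and `classify_next_section` are hypothetical, and the sketch leaves out the special sections that forbid nesting and the level-0 sections attached directly to the document:

```ruby
# Hypothetical helper: shift a section level by the leveloffset attribute
# value, never going below level 0.
def apply_leveloffset(level, leveloffset)
  adjusted = level + leveloffset.to_i
  adjusted < 0 ? 0 : adjusted
end

# Hypothetical helper: decide what to do with the next section title.
# Returns :close when the title belongs to an ancestor or sibling,
# :nest when it is a child at an expected level, and
# :nest_out_of_sequence when it is a child at an unexpected level
# (the case in which the listing logs a warning and nests anyway).
def classify_next_section(next_level, current_level, expected_next_level, expected_next_level_alt = nil)
  return :close unless next_level > current_level
  in_sequence = next_level == expected_next_level ||
    (expected_next_level_alt && next_level == expected_next_level_alt) ||
    expected_next_level < 0
  in_sequence ? :nest : :nest_out_of_sequence
end
```

A negative expected level stands for the fragment case in the listing, where any deeper level is accepted without a warning.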

.parse(reader, document, options = {}) ⇒ Object

Parses AsciiDoc source read from the Reader into the Document

This method is the main entry-point into the Parser when parsing a full document. It first looks for and, if found, processes the document title. It then proceeds to iterate through the lines in the Reader, parsing the document into nested Sections and Blocks.

Parameters:

  • reader

    the Reader holding the source lines of the document

  • document

    the empty Document into which the lines will be parsed

  • options (defaults to: {})

    a Hash of options to control processing

Returns:

  • the Document object



# File 'lib/asciidoctor/parser.rb', line 97

def self.parse(reader, document, options = {})
  block_attributes = parse_document_header(reader, document, (header_only = options[:header_only]))

  # NOTE don't use a postfix conditional here as it's known to confuse JRuby in certain circumstances
  unless header_only
    while reader.has_more_lines?
      new_section, block_attributes = next_section(reader, document, block_attributes)
      if new_section
        document.assign_numeral new_section
        document.blocks << new_section
      end
    end
  end

  document
end

.parse_blocks(reader, parent, attributes = nil) ⇒ void

This method returns an undefined value.

Parse blocks from this reader until there are no more lines.

This method calls Parser#next_block until there are no more lines in the Reader. It does not consider sections because it’s assumed the Reader only has lines which are within a delimited block region.

Parameters:

  • reader

    The Reader containing the lines to process

  • parent

    The parent Block to which to attach the parsed blocks



# File 'lib/asciidoctor/parser.rb', line 1098

def self.parse_blocks(reader, parent, attributes = nil)
  if attributes
    while ((block = next_block reader, parent, attributes.merge) && parent.blocks << block) || reader.has_more_lines?; end
  else
    while ((block = next_block reader, parent) && parent.blocks << block) || reader.has_more_lines?; end
  end
  nil
end

.parse_document_header(reader, document, header_only = false) ⇒ Object

Parses the document header of the AsciiDoc source read from the Reader

Reads the AsciiDoc source from the Reader until the end of the document header is reached. The Document object is populated with information from the header (document title, document attributes, etc). The document attributes are then saved to establish a save point to which to rollback after parsing is complete.

This method assumes that there are no blank lines at the start of the document, which are automatically removed by the reader.

returns the Hash of orphan block attributes captured above the header



# File 'lib/asciidoctor/parser.rb', line 126

def self.parse_document_header(reader, document, header_only = false)
  # capture lines of block-level metadata and plow away comment lines that precede first block
  block_attrs = reader.skip_blank_lines ? (parse_block_metadata_lines reader, document) : {}
  doc_attrs = document.attributes

  # special case, block title is not allowed above document title,
  # carry attributes over to the document body
  if (implicit_doctitle = is_next_line_doctitle? reader, block_attrs, doc_attrs['leveloffset']) && block_attrs['title']
    doc_attrs['authorcount'] = 0
    return document.finalize_header block_attrs, false
  end

  # yep, document title logic in AsciiDoc is just insanity
  # definitely an area for spec refinement

  unless (val = doc_attrs['doctitle']).nil_or_empty?
    document.title = doctitle_attr_val = val
  end

  # if the first line is the document title, add a header to the document and parse the header metadata
  if implicit_doctitle
    source_location = reader.cursor if document.sourcemap
    document.id, _, l0_section_title, _, atx = parse_section_title reader, document
    if doctitle_attr_val
      # NOTE doctitle attribute (set above or below implicit doctitle) overrides implicit doctitle
      l0_section_title = nil
    else
      document.title = l0_section_title
      if (doc_attrs['doctitle'] = doctitle_attr_val = document.sub_specialchars l0_section_title).include? ATTR_REF_HEAD
        # QUESTION should we defer substituting attributes until the end of the header? or should we substitute again if necessary?
        doc_attrs['doctitle'] = doctitle_attr_val = document.sub_attributes doctitle_attr_val, attribute_missing: 'skip'
      end
    end
    document.header.source_location = source_location if source_location
    # default to compat-mode if document has setext doctitle
    doc_attrs['compat-mode'] = '' unless atx || (document.attribute_locked? 'compat-mode')
    if (separator = block_attrs['separator'])
      doc_attrs['title-separator'] = separator unless document.attribute_locked? 'title-separator'
    end
    if (doc_id = block_attrs['id'])
      document.id = doc_id
    else
      doc_id = document.id
    end
    if (role = block_attrs['role'])
      doc_attrs['role'] = role
    end
    if (reftext = block_attrs['reftext'])
      doc_attrs['reftext'] = reftext
    end
    block_attrs.clear
    (modified_attrs = document.instance_variable_get :@attributes_modified).delete 'doctitle'
    parse_header_metadata reader, document, nil
    if modified_attrs.include? 'doctitle'
      if (val = doc_attrs['doctitle']).nil_or_empty? || val == doctitle_attr_val
        doc_attrs['doctitle'] = doctitle_attr_val
      else
        document.title = val
      end
    elsif !l0_section_title
      modified_attrs << 'doctitle'
    end
    document.register :refs, [doc_id, document] if doc_id
  elsif (author = doc_attrs['author'])
    author_metadata = process_authors author, true, false
    author_metadata.delete 'authorinitials' if doc_attrs['authorinitials']
    doc_attrs.update author_metadata
  elsif (author = doc_attrs['authors'])
    author_metadata = process_authors author, true
    doc_attrs.update author_metadata
  else
    doc_attrs['authorcount'] = 0
  end

  # parse title and consume name section of manpage document
  parse_manpage_header reader, document, block_attrs, header_only if document.doctype == 'manpage'

  # NOTE block_attrs are the block-level attributes (not document attributes) that
  # precede the first line of content (document title, first section or first block)
  document.finalize_header block_attrs
end

.parse_header_metadata(reader, document = nil, retrieve = true) ⇒ Object

Consume and parse the two header lines (line 1 = author info, line 2 = revision info).

Examples:

data = ["Author Name <[email protected]>\n", "v1.0, 2012-12-21: Coincide w/ end of world.\n"]
parse_header_metadata(Reader.new data, nil, normalize: true)
# => { 'author' => 'Author Name', 'firstname' => 'Author', 'lastname' => 'Name', 'email' => '[email protected]',
#       'revnumber' => '1.0', 'revdate' => '2012-12-21', 'revremark' => 'Coincide w/ end of world.' }

Parameters:

  • reader

    the Reader holding the source lines of the document

  • document (defaults to: nil)

    the Document we are building (default: nil)

Returns:

  • the Hash of header metadata. If a Document object is supplied, the metadata is applied directly to the attributes of the Document.



# File 'lib/asciidoctor/parser.rb', line 1814

def self.parse_header_metadata reader, document = nil, retrieve = true
  doc_attrs = document && document.attributes
  # NOTE this will discard any comment lines, but not skip blank lines
  process_attribute_entries reader, document

  if reader.has_more_lines? && !reader.next_line_empty?
    authorcount = (implicit_author_metadata = process_authors reader.read_line).delete 'authorcount'
    if document && (doc_attrs['authorcount'] = authorcount) > 0
      implicit_author_metadata.each do |key, val|
        # apply header subs and assign to document; attributes substitution only relevant for email
        doc_attrs[key] = document.apply_header_subs val unless doc_attrs.key? key
      end
      implicit_author = doc_attrs['author']
      implicit_authorinitials = doc_attrs['authorinitials']
      implicit_authors = doc_attrs['authors']
    end
    implicit_author_metadata['authorcount'] = authorcount

    # NOTE this will discard any comment lines, but not skip blank lines
    process_attribute_entries reader, document

    if reader.has_more_lines? && !reader.next_line_empty?
      rev_line = reader.read_line
      if (match = RevisionInfoLineRx.match rev_line)
        rev_metadata = {}
        rev_metadata['revnumber'] = match[1].rstrip if match[1]
        unless (component = match[2].strip).empty?
          # version must begin with 'v' if date is absent
          if !match[1] && (component.start_with? 'v')
            rev_metadata['revnumber'] = component.slice 1, component.length
          else
            rev_metadata['revdate'] = component
          end
        end
        rev_metadata['revremark'] = match[3].rstrip if match[3]
        if document && !rev_metadata.empty?
          # apply header subs and assign to document
          rev_metadata.each do |key, val|
            doc_attrs[key] = document.apply_header_subs val unless doc_attrs.key? key
          end
        end
      else
        # throw it back
        reader.unshift_line rev_line
      end
    end

    # NOTE this will discard any comment lines, but not skip blank lines
    process_attribute_entries reader, document

    reader.skip_blank_lines
  else
    implicit_author_metadata = {}
  end

  # process author attribute entries that override (or stand in for) the implicit author line
  if document
    if doc_attrs.key?('author') && (author_line = doc_attrs['author']) != implicit_author
      # do not allow multiple, process as names only
      author_metadata = process_authors author_line, true, false
      author_metadata.delete 'authorinitials' if doc_attrs['authorinitials'] != implicit_authorinitials
    elsif doc_attrs.key?('authors') && (author_line = doc_attrs['authors']) != implicit_authors
      # allow multiple, process as names only
      author_metadata = process_authors author_line, true
    else
      authors, author_idx, author_key, explicit, sparse = [], 1, 'author_1', false, false
      while doc_attrs.key? author_key
        # only use indexed author attribute if value is different
        # leaves corner case if line matches with underscores converted to spaces; use double space to force
        if (author_override = doc_attrs[author_key]) == implicit_author_metadata[author_key]
          authors << nil
          sparse = true
        else
          authors << author_override
          explicit = true
        end
        author_key = %(author_#{author_idx += 1})
      end
      if explicit
        # rebuild implicit author names to reparse
        authors.each_with_index do |author, idx|
          next if author
          authors[idx] = [
            implicit_author_metadata[%(firstname_#{name_idx = idx + 1})],
            implicit_author_metadata[%(middlename_#{name_idx})],
            implicit_author_metadata[%(lastname_#{name_idx})]
          ].compact.map {|it| it.tr ' ', '_' }.join ' '
        end if sparse
        # process as names only
        author_metadata = process_authors authors, true, false
      else
        author_metadata = { 'authorcount' => 0 }
      end
    end

    if author_metadata['authorcount'] == 0
      if authorcount
        author_metadata = nil
      else
        doc_attrs['authorcount'] = 0
      end
    else
      doc_attrs.update author_metadata

      # special case
      if !doc_attrs.key?('email') && doc_attrs.key?('email_1')
        doc_attrs['email'] = doc_attrs['email_1']
      end
    end
  end

  implicit_author_metadata.merge rev_metadata.to_h, author_metadata.to_h if retrieve
end
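
The revision-line handling in the listing above (version before the comma, date after it, remark after the colon, and a bare component that counts as a version only if it starts with 'v') can be tried in isolation. The sketch below is an approximation under stated assumptions: `SIMPLE_REV_RX` is a simplified stand-in written for this example, not Asciidoctor's RevisionInfoLineRx, and `parse_revision_line` is a hypothetical helper name:

```ruby
# Simplified stand-in for the revision line pattern (an assumption, not the
# library's regular expression): optional "<version>," prefix, then a
# component, then an optional ": <remark>" suffix.
SIMPLE_REV_RX = /\A(?:[^\d{]*(.*?),)? *(.*?)(?: *: *(.*))?\z/

# Hypothetical helper: turn a revision line into a Hash of rev* entries.
def parse_revision_line(rev_line)
  rev_metadata = {}
  return rev_metadata unless (match = SIMPLE_REV_RX.match rev_line)
  rev_metadata['revnumber'] = match[1].rstrip if match[1]
  component = match[2].strip
  unless component.empty?
    # without a "<version>," prefix, a component is a version only if it begins with 'v'
    if !match[1] && (component.start_with? 'v')
      rev_metadata['revnumber'] = component[1..-1]
    else
      rev_metadata['revdate'] = component
    end
  end
  rev_metadata['revremark'] = match[3].rstrip if match[3]
  rev_metadata
end

parse_revision_line 'v1.0, 2012-12-21: Coincide w/ end of world.'
# => {"revnumber"=>"1.0", "revdate"=>"2012-12-21", "revremark"=>"Coincide w/ end of world."}
```

The simplified pattern treats the first colon as the start of the remark, so it should not be expected to agree with the library on every edge case, such as a date that contains a time.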

.parse_manpage_header(reader, document, block_attributes, header_only = false) ⇒ Object

Parses the manpage header of the AsciiDoc source read from the Reader

returns Nothing



# File 'lib/asciidoctor/parser.rb', line 211

def self.parse_manpage_header(reader, document, block_attributes, header_only = false)
  if ManpageTitleVolnumRx =~ (doc_attrs = document.attributes)['doctitle']
    doc_attrs['manvolnum'] = manvolnum = $2
    doc_attrs['mantitle'] = (((mantitle = $1).include? ATTR_REF_HEAD) ? (document.sub_attributes mantitle) : mantitle).downcase
  else
    logger.error message_with_context 'non-conforming manpage title', source_location: (reader.cursor_at_line 1)
    # provide sensible fallbacks
    doc_attrs['mantitle'] = doc_attrs['doctitle'] || doc_attrs['docname'] || 'command'
    doc_attrs['manvolnum'] = manvolnum = '1'
  end
  if (manname = doc_attrs['manname']) && doc_attrs['manpurpose']
    doc_attrs['manname-title'] ||= 'Name'
    doc_attrs['mannames'] = [manname]
    if document.backend == 'manpage'
      doc_attrs['docname'] = manname
      doc_attrs['outfilesuffix'] = %(.#{manvolnum})
    end
  elsif header_only
    # done
  else
    reader.skip_blank_lines
    reader.save
    block_attributes.update parse_block_metadata_lines reader, document
    if (name_section_level = is_next_line_section? reader, {})
      if name_section_level == 1
        name_section = initialize_section reader, document, {}
        name_section_buffer = (reader.read_lines_until break_on_blank_lines: true, skip_line_comments: true).map {|l| l.lstrip }.join ' '
        if ManpageNamePurposeRx =~ name_section_buffer
          if (manname = $1).include? ATTR_REF_HEAD
            manname = document.sub_attributes manname
          end
          if manname.include? ','
            manname = (mannames = (manname.split ',').map {|n| n.lstrip })[0]
          else
            mannames = [manname]
          end
          if (manpurpose = $2).include? ATTR_REF_HEAD
            manpurpose = document.sub_attributes manpurpose
          end
          doc_attrs['manname-title'] ||= name_section.title
          doc_attrs['manname-id'] = name_section.id if name_section.id
          doc_attrs['manname'] = manname
          doc_attrs['mannames'] = mannames
          doc_attrs['manpurpose'] = manpurpose
          if document.backend == 'manpage'
            doc_attrs['docname'] = manname
            doc_attrs['outfilesuffix'] = %(.#{manvolnum})
          end
        else
          error_msg = 'non-conforming name section body'
        end
      else
        error_msg = 'name section must be at level 1'
      end
    else
      error_msg = 'name section expected'
    end
    if error_msg
      reader.restore_save
      logger.error message_with_context error_msg, source_location: reader.cursor
      doc_attrs['manname'] = manname = doc_attrs['docname'] || 'command'
      doc_attrs['mannames'] = [manname]
      if document.backend == 'manpage'
        doc_attrs['docname'] = manname
        doc_attrs['outfilesuffix'] = %(.#{manvolnum})
      end
    else
      reader.discard_save
    end
  end
  nil
end
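
To make the two parsing steps in the listing above concrete, the sketch below extracts the title and volume number from a doctitle such as "asciidoctor(1)" and splits a name section body such as "name - purpose". The two patterns are simplified approximations written for this example rather than the library's ManpageTitleVolnumRx and ManpageNamePurposeRx, and both method names are hypothetical:

```ruby
# Approximate patterns (assumptions made for this sketch).
TITLE_VOLNUM_RX = /\A(.+?) *\( *(.+?) *\)\z/
NAME_PURPOSE_RX = /\A(.+?) +- +(.+)\z/

# Hypothetical helper: derive mantitle and manvolnum from the doctitle,
# falling back to the same defaults the listing uses.
def parse_manpage_title(doctitle)
  if (m = TITLE_VOLNUM_RX.match doctitle)
    { 'mantitle' => m[1].downcase, 'manvolnum' => m[2] }
  else
    { 'mantitle' => doctitle || 'command', 'manvolnum' => '1' }
  end
end

# Hypothetical helper: split the name section body into names and purpose.
# Returns nil if the body does not have the "name - purpose" shape.
def parse_name_section(body)
  return unless (m = NAME_PURPOSE_RX.match body)
  mannames = (m[1].split ',').map {|n| n.lstrip }
  { 'manname' => mannames[0], 'mannames' => mannames, 'manpurpose' => m[2] }
end
```

Attribute reference substitution and the docname and outfilesuffix updates for the manpage backend are left out of the sketch.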

.parse_style_attribute(attributes, reader = nil) ⇒ Object

Parse the first positional attribute and assign named attributes

Parse the first positional attribute to extract the style, role and id parts, assign the values to their corresponding attribute keys and return the parsed style from the first positional attribute.

Examples:

puts attributes
=> { 1 => "abstract#intro.lead%fragment", "style" => "preamble" }
parse_style_attribute(attributes)
=> "abstract"
puts attributes
=> { 1 => "abstract#intro.lead%fragment", "style" => "abstract", "id" => "intro",
      "role" => "lead", "options" => "fragment", "fragment-option" => '' }

Parameters:

  • attributes

    The Hash of attributes to process and update

Returns:

  • the String style parsed from the first positional attribute



# File 'lib/asciidoctor/parser.rb', line 2567

def self.parse_style_attribute attributes, reader = nil
  # NOTE spaces are not allowed in shorthand, so if we detect one, this ain't no shorthand
  if (raw_style = attributes[1]) && !raw_style.include?(' ') && Compliance.shorthand_property_syntax
    name = nil
    accum = ''
    parsed_attrs = {}

    raw_style.each_char do |c|
      case c
      when '.'
        yield_buffered_attribute parsed_attrs, name, accum, reader
        accum = ''
        name = :role
      when '#'
        yield_buffered_attribute parsed_attrs, name, accum, reader
        accum = ''
        name = :id
      when '%'
        yield_buffered_attribute parsed_attrs, name, accum, reader
        accum = ''
        name = :option
      else
        accum += c
      end
    end

    # small optimization if no shorthand is found
    if name
      yield_buffered_attribute parsed_attrs, name, accum, reader

      if (parsed_style = parsed_attrs[:style])
        attributes['style'] = parsed_style
      end

      attributes['id'] = parsed_attrs[:id] if parsed_attrs.key? :id

      if parsed_attrs.key? :role
        attributes['role'] = (existing_role = attributes['role']).nil_or_empty? ? (parsed_attrs[:role].join ' ') : %(#{existing_role} #{parsed_attrs[:role].join ' '})
      end

      parsed_attrs[:option].each {|opt| attributes[%(#{opt}-option)] = '' } if parsed_attrs.key? :option

      parsed_style
    else
      attributes['style'] = raw_style
    end
  else
    attributes['style'] = raw_style
  end
end
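
The character scan above can be illustrated with a minimal standalone sketch. The name parse_shorthand and the shape of the returned Hash are assumptions made for illustration; the sketch does not call yield_buffered_attribute, does not update an attributes Hash, and does not consult Compliance.shorthand_property_syntax.

```ruby
# Standalone sketch; parse_shorthand is a hypothetical helper, not Asciidoctor API.
def parse_shorthand raw
  parsed = { style: nil, id: nil, roles: [], options: [] }
  name = :style # text before the first marker is treated as the style
  accum = +''

  # Store the buffered text under the current part name, ignoring empty parts.
  flush = lambda do
    next if accum.empty?
    case name
    when :role then parsed[:roles] << accum
    when :option then parsed[:options] << accum
    else parsed[name] = accum
    end
  end

  raw.each_char do |c|
    case c
    when '.', '#', '%'
      # A marker character ends the current part and starts the next one.
      flush.call
      accum = +''
      name = { '.' => :role, '#' => :id, '%' => :option }[c]
    else
      accum << c
    end
  end
  flush.call
  parsed
end

result = parse_shorthand 'abstract#intro.lead%fragment'
```

For the input 'abstract#intro.lead%fragment', the sketch yields the style "abstract", the id "intro", the roles ["lead"] and the options ["fragment"].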

.process_attribute_entries(reader, document, attributes = nil) ⇒ void

This method returns an undefined value.

Process consecutive attribute entry lines, ignoring adjacent line comments and comment blocks.



# File 'lib/asciidoctor/parser.rb', line 2093

def self.process_attribute_entries reader, document, attributes = nil
  reader.skip_comment_lines
  while process_attribute_entry reader, document, attributes
    # discard line just processed
    reader.shift
    reader.skip_comment_lines
  end
end

.process_attribute_entry(reader, document, attributes = nil, match = nil) ⇒ Object



# File 'lib/asciidoctor/parser.rb', line 2102

def self.process_attribute_entry reader, document, attributes = nil, match = nil
  if match || (match = reader.has_more_lines? ? (AttributeEntryRx.match reader.peek_line) : nil)
    if (value = match[2]).nil_or_empty?
      value = ''
    elsif value.end_with? LINE_CONTINUATION, LINE_CONTINUATION_LEGACY
      con, value = (value.slice value.length - 2, 2), (value.slice 0, value.length - 2).rstrip
      while reader.advance && !(next_line = reader.peek_line || '').empty?
        next_line = next_line.lstrip
        next_line = (next_line.slice 0, next_line.length - 2).rstrip if (keep_open = next_line.end_with? con)
        value = %(#{value}#{(value.end_with? HARD_LINE_BREAK) ? LF : ' '}#{next_line})
        break unless keep_open
      end
    end

    store_attribute match[1], value, document, attributes
    true
  end
end
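
The continuation handling above can be sketched without a Reader. The helper join_continued_value is hypothetical, and the default marker values (a space followed by a backslash for the line continuation, a space followed by a plus sign for the hard line break) are assumptions about the library constants; the legacy continuation marker is not handled.

```ruby
# Standalone sketch; join_continued_value is a hypothetical helper.
# The continuation and hard_break defaults are assumed values.
def join_continued_value first, rest, continuation: ' \\', hard_break: ' +'
  return first unless first.end_with? continuation
  # Drop the two-character continuation marker from the first line.
  value = first[0...-2].rstrip
  rest.each do |line|
    break if line.empty?
    line = line.lstrip
    keep_open = line.end_with? continuation
    line = line[0...-2].rstrip if keep_open
    # A hard line break at the end of the value is joined with a newline.
    separator = (value.end_with? hard_break) ? "\n" : ' '
    value = %(#{value}#{separator}#{line})
    break unless keep_open
  end
  value
end

joined = join_continued_value 'A long \\', ['  value split \\', 'over three lines', 'ignored']
```

Here joined is "A long value split over three lines"; the last element is not consumed because the preceding line does not end with the continuation marker.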

.read_paragraph_lines(reader, break_at_list, opts = {}) ⇒ Object



# File 'lib/asciidoctor/parser.rb', line 962

def self.read_paragraph_lines reader, break_at_list, opts = {}
  opts[:break_on_blank_lines] = true
  opts[:break_on_list_continuation] = true
  opts[:preserve_last_line] = true
  break_condition = (break_at_list ?
      (Compliance.block_terminates_paragraph ? StartOfBlockOrListProc : StartOfListProc) :
      (Compliance.block_terminates_paragraph ? StartOfBlockProc : NoOp))
  reader.read_lines_until opts, &break_condition
end
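
The nested ternary above selects one of four break conditions. A minimal sketch of that decision follows, with Symbols standing in for the procs; choose_break_condition is a hypothetical name, and the second argument stands in for Compliance.block_terminates_paragraph.

```ruby
# Standalone sketch; Symbols stand in for StartOfBlockOrListProc,
# StartOfListProc, StartOfBlockProc and NoOp respectively.
def choose_break_condition break_at_list, block_terminates_paragraph
  if break_at_list
    block_terminates_paragraph ? :start_of_block_or_list : :start_of_list
  else
    block_terminates_paragraph ? :start_of_block : :none
  end
end

condition = choose_break_condition true, false
```

With break_at_list set and block termination disabled, as in the call above, only the start of a list ends the paragraph.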

.setext_section_title?(line1, line2) ⇒ Integer

Checks whether the given lines are a setext section title.

Parameters:

  • line1 (String)

    candidate title

  • line2 (String)

    candidate underline

Returns:

  • (Integer)

    Returns the Integer section level if these lines are a setext section title, otherwise nil.



# File 'lib/asciidoctor/parser.rb', line 1721

def self.setext_section_title? line1, line2
  if (level = SETEXT_SECTION_LEVELS[line2_ch0 = line2.chr]) && (uniform? line2, line2_ch0, (line2_len = line2.length)) &&
      (SetextSectionTitleRx.match? line1) && (line1.length - line2_len).abs < 2
    level
  end
end
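
The checks above can be sketched in standalone form. The helper setext_title_level is hypothetical; the levels map is an assumption about the contents of SETEXT_SECTION_LEVELS, and the title pattern is a simplified stand-in for SetextSectionTitleRx (a line that does not start with a period and contains a word character).

```ruby
# Standalone sketch; the levels default and the title pattern are assumptions.
def setext_title_level line1, line2, levels: { '=' => 0, '-' => 1, '~' => 2, '^' => 3, '+' => 4 }
  marker = line2[0]
  # The first underline character determines the candidate level.
  return unless (level = levels[marker])
  # The underline must consist of that one character only.
  return unless line2.each_char.all? {|c| c == marker }
  # Simplified stand-in for the title pattern.
  return unless line1.match?(/\A(?!\.).*\w/)
  # The title and underline lengths may differ by at most one character.
  return unless (line1.length - line2.length).abs < 2
  level
end

level = setext_title_level 'Section Title', '-' * 13
```

Under the assumed map, an underline of hyphens of matching length gives level 1, while an underline that is too long or mixes characters gives nil.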

.store_attribute(name, value, doc = nil, attrs = nil) ⇒ Object

Store the attribute in the document, if one is given, and register an attribute entry in attrs, if given.

Parameters:

  • name

    the String name of the attribute to store; if the name begins or ends with !, the attribute with that root name is removed (unset)

  • value

    the String value of the attribute to store

  • doc (defaults to: nil)

    the Document being parsed

  • attrs (defaults to: nil)

    the attributes for the current context

Returns:

  • a 2-element Array containing the resolved attribute name (minus the ! indicator) and value



# File 'lib/asciidoctor/parser.rb', line 2130

def self.store_attribute name, value, doc = nil, attrs = nil
  # TODO move processing of attribute value to utility method
  if name.end_with? '!'
    # a nil value signals the attribute should be deleted (unset)
    name = name.chop
    value = nil
  elsif name.start_with? '!'
    # a nil value signals the attribute should be deleted (unset)
    name = (name.slice 1, name.length)
    value = nil
  end

  if (name = sanitize_attribute_name name) == 'numbered'
    name = 'sectnums'
  elsif name == 'hardbreaks'
    name = 'hardbreaks-option'
  elsif name == 'showtitle'
    store_attribute 'notitle', (value ? nil : ''), doc, attrs
  end

  if doc
    if value
      if name == 'leveloffset'
        # support relative leveloffset values
        if value.start_with? '+'
          value = ((doc.attr 'leveloffset', 0).to_i + (value.slice 1, value.length).to_i).to_s
        elsif value.start_with? '-'
          value = ((doc.attr 'leveloffset', 0).to_i - (value.slice 1, value.length).to_i).to_s
        end
      end
      # QUESTION should we set value to locked value if set_attribute returns false?
      if (resolved_value = doc.set_attribute name, value)
        value = resolved_value
        (Document::AttributeEntry.new name, value).save_to attrs if attrs
      end
    elsif (doc.delete_attribute name) && attrs
      (Document::AttributeEntry.new name, value).save_to attrs
    end
  elsif attrs
    (Document::AttributeEntry.new name, value).save_to attrs
  end

  [name, value]
end
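
Two steps of the method above, the handling of the ! marker and the relative leveloffset arithmetic, can be sketched in isolation. The helpers split_unset_marker and resolve_leveloffset are hypothetical names; the sketch does not touch a Document, does not sanitize or rename attribute names, and does not save attribute entries.

```ruby
# Standalone sketch; both helpers are hypothetical, not Asciidoctor API.

# A leading or trailing ! strips the marker and sets the value to nil (unset).
def split_unset_marker name, value
  if name.end_with? '!'
    [name.chop, nil]
  elsif name.start_with? '!'
    [name[1..-1], nil]
  else
    [name, value]
  end
end

# A value starting with + or - is applied relative to the current offset.
def resolve_leveloffset current, value
  if value.start_with? '+'
    (current.to_i + value[1..-1].to_i).to_s
  elsif value.start_with? '-'
    (current.to_i - value[1..-1].to_i).to_s
  else
    value
  end
end

name, value = split_unset_marker 'sectnums!', ''
offset = resolve_leveloffset 1, '+1'
```

Here name is "sectnums", value is nil, and offset is "2".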

.uniform?(str, chr, len) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/asciidoctor/parser.rb', line 2748

def self.uniform? str, chr, len
  (str.count chr) == len
end