Class: PrettyPrint

Inherits:
Object
Defined in:
lib/syntax_tree/prettyprint.rb

Overview

This class implements a pretty printing algorithm. It finds line breaks and nice indentations for grouped structure.

By default, the class assumes that primitive elements are strings and each byte in the strings is a single column in width. But it can be used for other situations by giving suitable arguments for some methods:

  • newline object and space generation block for PrettyPrint.new

  • optional width argument for PrettyPrint#text

  • PrettyPrint#breakable

There are several candidate uses:

  • text formatting using proportional fonts

  • multibyte characters whose column width differs from their number of bytes

  • non-string formatting

Usage

To use this class, you will need to generate a tree of print nodes that represent indentation and newline behavior before it gets sent to the printer. Each node has different semantics, depending on the desired output.

The most basic node is a Text node. This represents plain text content that cannot be broken up even if it doesn’t fit on one line. You would create one of those with the text method, as in:

PrettyPrint.format { |q| q.text('my content') }

No matter what the desired output width is, the output for the snippet above will always be the same.

If you want to allow the printer to break up the content on the space character when there isn’t enough width for the full string on the same line, you can use the Breakable and Group nodes. For example:

PrettyPrint.format do |q|
  q.group do
    q.text('my')
    q.breakable
    q.text('content')
  end
end

Now, if everything fits on one line (depending on the maximum width specified) then it will be the same output as the first example. If, however, there is not enough room on the line, then you will get two lines of output, one for the first string and one for the second.
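As a rough illustration of the two outcomes described above, the sketch below renders the same tree at two widths. It uses Ruby's standard-library prettyprint as a stand-in (an assumption, made so the snippet runs without extra gems); that library offers the same format, group, text, and breakable calls, and the class documented here is expected, though not verified in this sketch, to give the same result. The helper name render_pair is hypothetical.

```ruby
require "prettyprint" # stdlib stand-in; assumed to match this class for these calls

# Hypothetical helper: renders "my" and "content" with one optional break.
def render_pair(maxwidth)
  PrettyPrint.format("".dup, maxwidth) do |q|
    q.group do
      q.text("my")
      q.breakable
      q.text("content")
    end
  end
end

puts render_pair(80) # everything fits, so the breakable prints as a space
puts render_pair(5)  # too narrow, so the group breaks into two lines
```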

There are other nodes for the print tree as well, described in the documentation below. They control alignment, indentation, conditional formatting, and more.

Bugs

  • Box based formatting?

Report any bugs at bugs.ruby-lang.org

References

Christian Lindig, Strictly Pretty, March 2000, lindig.github.io/papers/strictly-pretty-2000.pdf

Philip Wadler, A prettier printer, March 1998, homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf

Author

Tanaka Akira

Defined Under Namespace

Modules: Buffer

Classes: Align, BreakParent, Breakable, Group, IfBreak, IfBreakBuilder, Indent, IndentLevel, LineSuffix, SingleLine, Text, Trim

Constant Summary

DEFAULT_NEWLINE =

When printing, you can optionally specify the value that should be used whenever a group needs to be broken onto multiple lines. The default is "\n".

"\n"
DEFAULT_GENSPACE =

When generating spaces after a newline for indentation, by default we generate one space per character needed for indentation. You can change this behavior (for instance to use tabs) by passing a different genspace procedure.

->(n) { " " * n }
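A minimal sketch of swapping in a different genspace procedure, here one tab per indentation column. It relies on Ruby's standard-library prettyprint as a stand-in (an assumption; its format method accepts the same four positional arguments), and the helper name render_with_tabs is hypothetical.

```ruby
require "prettyprint" # stdlib stand-in; assumed to match this class for these calls

# Hypothetical helper: indents with tabs instead of spaces.
def render_with_tabs(maxwidth)
  tab_genspace = ->(n) { "\t" * n } # one tab per column of indentation
  PrettyPrint.format("".dup, maxwidth, "\n", tab_genspace) do |q|
    q.group(1, "[", "]") do
      q.text("alpha")
      q.breakable
      q.text("beta")
    end
  end
end

p render_with_tabs(8)  # the broken line starts with a tab
p render_with_tabs(80) # no break, so genspace is never called
```

Note that the printer still counts the tab as a single column, so widths are only approximate when the terminal expands tabs.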
MODE_BREAK =

There are two modes in printing, break and flat. When we’re in break mode, any lines will use their newline, any if-breaks will use their break contents, etc.

1
MODE_FLAT =

This is another print mode much like MODE_BREAK. When we’re in flat mode, we attempt to print everything on one line until we either hit a broken group, a forced line, or the maximum width.

2
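To make the two modes more concrete, here is a deliberately simplified, self-contained model of a printer that switches between break and flat per group. Everything in it (the ModeSketch module, its Struct node types, and the :break and :flat symbols) is hypothetical and only loosely mirrors the real implementation: it ignores indentation, and its fit check looks only at the group itself rather than at the rest of the line.

```ruby
# Hypothetical miniature printer, for illustration only.
module ModeSketch
  Group   = Struct.new(:contents)
  Line    = Struct.new(:separator)
  IfBreak = Struct.new(:break_contents, :flat_contents)

  # Width of a node if it were printed entirely in flat mode.
  def self.flat_width(doc)
    case doc
    when String  then doc.length
    when Array   then doc.sum { |part| flat_width(part) }
    when Group   then flat_width(doc.contents)
    when Line    then doc.separator.length
    when IfBreak then flat_width(doc.flat_contents)
    end
  end

  def self.render(doc, maxwidth)
    out = +""
    position = 0
    commands = [[:break, doc]] # stack of [mode, node] pairs

    until commands.empty?
      mode, node = commands.pop

      case node
      when String
        out << node
        position += node.length
      when Array
        node.reverse_each { |part| commands << [mode, part] }
      when Group
        # Print the group flat only if it fits in the remaining width.
        fits = flat_width(node) <= maxwidth - position
        commands << [fits ? :flat : :break, node.contents]
      when Line
        if mode == :flat
          out << node.separator
          position += node.separator.length
        else
          out << "\n"
          position = 0
        end
      when IfBreak
        chosen = mode == :break ? node.break_contents : node.flat_contents
        commands << [mode, chosen]
      end
    end

    out
  end

  # Sample tree: a block that uses braces when flat and do/end when broken.
  def self.sample_doc
    Group.new(
      ["items.each", Line.new(" "), IfBreak.new("do", "{"), Line.new(" "),
       "work", Line.new(" "), IfBreak.new("end", "}")]
    )
  end
end

puts ModeSketch.render(ModeSketch.sample_doc, 80) # flat mode
puts ModeSketch.render(ModeSketch.sample_doc, 10) # break mode
```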

Constructor Details

#initialize(output = "".dup, maxwidth = 80, newline = DEFAULT_NEWLINE, &genspace) ⇒ PrettyPrint

Creates a buffer for pretty printing.

output is an output target. If it is not specified, "" is assumed. It should have a << method which accepts the first argument obj of PrettyPrint#text, the first argument separator of PrettyPrint#breakable, the first argument newline of PrettyPrint.new, and the result of a given block for PrettyPrint.new.

maxwidth specifies the maximum line length. If it is not specified, 80 is assumed. However, actual output may overflow maxwidth if long non-breakable texts are provided.

newline is used for line breaks. "\n" is used if it is not specified.

The block is used to generate spaces. ->(n) { " " * n } is used if it is not given.
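The only hard requirement on output is the << method, so it does not have to be a String. The sketch below passes an Array and calls flush by hand. It uses Ruby's standard-library prettyprint as a stand-in (an assumption; its constructor takes the same arguments), and collect_pieces is a hypothetical helper name. How the pieces are split up inside the array is an implementation detail, so only the joined result is shown.

```ruby
require "prettyprint" # stdlib stand-in; assumed to match this class for these calls

# Hypothetical helper: collects printed fragments into an Array.
def collect_pieces
  pieces = []                     # any object that responds to << can be the output
  q = PrettyPrint.new(pieces, 80) # newline and genspace are left at their defaults
  q.group do
    q.text("a")
    q.breakable
    q.text("b")
  end
  q.flush                         # nothing is guaranteed to be written before this
  pieces
end

puts collect_pieces.join
```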



# File 'lib/syntax_tree/prettyprint.rb', line 682

def initialize(
  output = "".dup,
  maxwidth = 80,
  newline = DEFAULT_NEWLINE,
  &genspace
)
  @output = output
  @buffer = Buffer.for(output)
  @maxwidth = maxwidth
  @newline = newline
  @genspace = genspace || DEFAULT_GENSPACE
  reset
end

Instance Attribute Details

#buffer ⇒ Object (readonly)

This is an output buffer that wraps the output object and provides additional functionality depending on its type.

This defaults to Buffer::StringBuffer.new("".dup)



# File 'lib/syntax_tree/prettyprint.rb', line 641

def buffer
  @buffer
end

#genspace ⇒ Object (readonly)

An object that responds to call, takes one Integer argument, and returns the corresponding number of spaces.

By default this is: ->(n) { " " * n }



# File 'lib/syntax_tree/prettyprint.rb', line 657

def genspace
  @genspace
end

#groups ⇒ Object (readonly)

The stack of groups that are being printed.



# File 'lib/syntax_tree/prettyprint.rb', line 660

def groups
  @groups
end

#maxwidth ⇒ Object (readonly)

The maximum width of a line before it is broken onto a new line.

This defaults to 80, and should be an Integer.



# File 'lib/syntax_tree/prettyprint.rb', line 646

def maxwidth
  @maxwidth
end

#newline ⇒ Object (readonly)

The value that is appended to output to add a new line.

This defaults to "\n", and should be a String.



# File 'lib/syntax_tree/prettyprint.rb', line 651

def newline
  @newline
end

#output ⇒ Object (readonly)

The output object. It represents the final destination of the contents of the print tree. It should respond to <<.

This defaults to "".dup



# File 'lib/syntax_tree/prettyprint.rb', line 635

def output
  @output
end

#target ⇒ Object (readonly)

The array of contents that the methods generating print tree nodes currently append to.



# File 'lib/syntax_tree/prettyprint.rb', line 664

def target
  @target
end

Class Method Details

.format(output = "".dup, maxwidth = 80, newline = DEFAULT_NEWLINE, genspace = DEFAULT_GENSPACE) {|q| ... } ⇒ Object

This is a convenience method that is equivalent to the following:

begin
  q = PrettyPrint.new(output, maxwidth, newline, &genspace)
  ...
  q.flush
  output
end

Yields:

  • (q)


# File 'lib/syntax_tree/prettyprint.rb', line 601

def self.format(
  output = "".dup,
  maxwidth = 80,
  newline = DEFAULT_NEWLINE,
  genspace = DEFAULT_GENSPACE
)
  q = new(output, maxwidth, newline, &genspace)
  yield q
  q.flush
  output
end

.singleline_format(output = "".dup, maxwidth = nil, newline = nil, genspace = nil) {|q| ... } ⇒ Object

This is similar to PrettyPrint::format but the result has no breaks.

maxwidth, newline and genspace are ignored.

The invocation of breakable in the block doesn’t break a line and is treated as just an invocation of text.
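A small sketch of the behavior described above, using Ruby's standard-library prettyprint as a stand-in (an assumption; it provides singleline_format with the same signature). The helper name render_single_line is hypothetical.

```ruby
require "prettyprint" # stdlib stand-in; assumed to match this class for these calls

# Hypothetical helper: the breakable can never turn into a newline here.
def render_single_line
  PrettyPrint.singleline_format("".dup) do |q|
    q.group do
      q.text("my")
      q.breakable # emitted as its separator, as if it were plain text
      q.text("content")
    end
  end
end

puts render_single_line
```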

Yields:

  • (q)


# File 'lib/syntax_tree/prettyprint.rb', line 620

def self.singleline_format(
  output = "".dup,
  maxwidth = nil,
  newline = nil,
  genspace = nil
)
  q = SingleLine.new(output)
  yield q
  output
end

Instance Method Details

#break_parent ⇒ Object

This inserts a BreakParent node into the print tree which forces the surrounding group and all of its parent groups to break.



# File 'lib/syntax_tree/prettyprint.rb', line 901

def break_parent
  doc = BreakParent.new
  target << doc

  groups.reverse_each do |group|
    break if group.break?
    group.break
  end

  doc
end

#breakable(separator = " ", width = separator.length, indent: true, force: false) ⇒ Object

This says “you can break a line here if necessary”, and a width-column text separator is inserted if a line is not broken at the point.

If separator is not specified, " " is used.

If width is not specified, separator.length is used. You will have to specify this when separator is a multibyte character, for example.

By default, if the surrounding group is broken and a newline is inserted, the printer will indent the subsequent line up to the current level of indentation. You can disable this behavior with the indent argument if that’s not desired (rare).

By default, when you insert a Breakable into the print tree, it only breaks the surrounding group when the group’s contents cannot fit onto the remaining space of the current line. You can force it to break the surrounding group instead if you always want the newline with the force argument.
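The sketch below shows a custom separator: it is printed when the group stays on one line and dropped entirely when the group breaks. It uses Ruby's standard-library prettyprint as a stand-in (an assumption), which supports the positional separator and width arguments but not the indent: and force: keywords, so those are not exercised here. The helper name join_pair is hypothetical.

```ruby
require "prettyprint" # stdlib stand-in; indent: and force: are not available in it

# Hypothetical helper: joins two words with ", " or a newline.
def join_pair(maxwidth)
  PrettyPrint.format("".dup, maxwidth) do |q|
    q.group do
      q.text("one")
      q.breakable(", ") # width defaults to separator.length, here 2
      q.text("two")
    end
  end
end

p join_pair(80) # separator is kept
p join_pair(5)  # separator is replaced by the newline
```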



# File 'lib/syntax_tree/prettyprint.rb', line 885

def breakable(
  separator = " ",
  width = separator.length,
  indent: true,
  force: false
)
  doc = Breakable.new(separator, width, indent: indent, force: force)

  target << doc
  break_parent if force

  doc
end

#current_group ⇒ Object

Returns the group most recently added to the stack.

Contrived example:

out = ""
=> ""
q = PrettyPrint.new(out)
=> #<PrettyPrint:0x0>
q.group {
  q.text q.current_group.inspect
  q.text q.newline
  q.group(q.current_group.depth + 1) {
    q.text q.current_group.inspect
    q.text q.newline
    q.group(q.current_group.depth + 1) {
      q.text q.current_group.inspect
      q.text q.newline
      q.group(q.current_group.depth + 1) {
        q.text q.current_group.inspect
        q.text q.newline
      }
    }
  }
}
=> 284
 puts out
#<PrettyPrint::Group:0x0 @depth=1>
#<PrettyPrint::Group:0x0 @depth=2>
#<PrettyPrint::Group:0x0 @depth=3>
#<PrettyPrint::Group:0x0 @depth=4>


# File 'lib/syntax_tree/prettyprint.rb', line 725

def current_group
  groups.last
end

#fill_breakable(separator = " ", width = separator.length) ⇒ Object

This is similar to #breakable except the decision to break or not is determined individually.

Two #fill_breakable under a group may cause 4 results: (break,break), (break,non-break), (non-break,break), (non-break,non-break). This is different to #breakable because two #breakable under a group may cause 2 results: (break,break), (non-break,non-break).

The text separator is inserted if a line is not broken at this point.

If separator is not specified, " " is used.

If width is not specified, separator.length is used. You will have to specify this when separator is a multibyte character, for example.
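To contrast the two behaviors, the sketch below lays out the same four words twice, once with #fill_breakable and once with #breakable. It uses Ruby's standard-library prettyprint as a stand-in (an assumption; it defines fill_breakable the same way, as a group around a single breakable). The helper name wrap_words and its fill: keyword are hypothetical.

```ruby
require "prettyprint" # stdlib stand-in; assumed to match this class for these calls

# Hypothetical helper: separates words with either kind of breakable.
def wrap_words(words, maxwidth, fill:)
  PrettyPrint.format("".dup, maxwidth) do |q|
    q.group do
      words.each_with_index do |word, index|
        unless index.zero?
          fill ? q.fill_breakable : q.breakable
        end
        q.text(word)
      end
    end
  end
end

words = %w[aa bb cc dd]
puts wrap_words(words, 5, fill: true)  # packs as many words per line as fit
puts wrap_words(words, 5, fill: false) # all breakables in the group break together
```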



# File 'lib/syntax_tree/prettyprint.rb', line 927

def fill_breakable(separator = " ", width = separator.length)
  group { breakable(separator, width) }
end

#flush ⇒ Object

Flushes all of the generated print tree onto the output buffer, then clears the generated tree from memory.



# File 'lib/syntax_tree/prettyprint.rb', line 731

def flush
  # First, get the root group, since we placed one at the top to begin with.
  doc = groups.first

  # This represents how far along the current line we are. It gets reset
  # back to 0 when we encounter a newline.
  position = 0

  # This is our command stack. A command consists of a triplet of an
  # indentation level, the mode (break or flat), and a doc node.
  commands = [[IndentLevel.new(genspace: genspace), MODE_BREAK, doc]]

  # This is a small optimization boolean. It keeps track of whether or not
  # when we hit a group node we should check if it fits on the same line.
  should_remeasure = false

  # This is a separate command stack that includes the same kind of triplets
  # as the commands variable. It is used to keep track of things that should
  # go at the end of printed lines once the other doc nodes are accounted for.
  # Typically this is used to implement comments.
  line_suffixes = []

  # This is a special sort used to order the line suffixes by both the
  # priority set on the line suffix and the index it was in the original
  # array.
  line_suffix_sort = ->(line_suffix) do
    [-line_suffix.last, -line_suffixes.index(line_suffix)]
  end

  # This is a linear stack instead of a mutually recursive call defined on
  # the individual doc nodes for efficiency.
  while commands.any?
    indent, mode, doc = commands.pop

    case doc
    when Text
      doc.objects.each { |object| buffer << object }
      position += doc.width
    when Array
      doc.reverse_each { |part| commands << [indent, mode, part] }
    when Indent
      commands << [indent.indent, mode, doc.contents]
    when Align
      commands << [indent.align(doc.indent), mode, doc.contents]
    when Trim
      position -= buffer.trim!
    when Group
      if mode == MODE_FLAT && !should_remeasure
        commands <<
          [indent, doc.break? ? MODE_BREAK : MODE_FLAT, doc.contents]
      else
        should_remeasure = false
        next_cmd = [indent, MODE_FLAT, doc.contents]

        if !doc.break? && fits?(next_cmd, commands, maxwidth - position)
          commands << next_cmd
        else
          commands << [indent, MODE_BREAK, doc.contents]
        end
      end
    when IfBreak
      if mode == MODE_BREAK
        commands << [indent, mode, doc.break_contents] if doc.break_contents
      elsif mode == MODE_FLAT
        commands << [indent, mode, doc.flat_contents] if doc.flat_contents
      end
    when LineSuffix
      line_suffixes << [indent, mode, doc.contents, doc.priority]
    when Breakable
      if mode == MODE_FLAT
        if doc.force?
          # This line was forced into the output even if we were in flat mode,
          # so we need to tell the next group that no matter what, it needs to
          # remeasure because the previous measurement didn't accurately
          # capture the entire expression (this is necessary for nested
          # groups).
          should_remeasure = true
        else
          buffer << doc.separator
          position += doc.width
          next
        end
      end

      # If there are any commands in the line suffix buffer, then we're going
      # to flush them now, as we are about to add a newline.
      if line_suffixes.any?
        commands << [indent, mode, doc]
        commands += line_suffixes.sort_by(&line_suffix_sort)
        line_suffixes = []
        next
      end

      if !doc.indent?
        buffer << newline

        if indent.root
          buffer << indent.root.value
          position = indent.root.length
        else
          position = 0
        end
      else
        position -= buffer.trim!
        buffer << newline
        buffer << indent.value
        position = indent.length
      end
    when BreakParent
      # do nothing
    else
      # Special case where the user has defined some way to get an extra doc
      # node that we don't explicitly support into the list. In this case
      # we're going to assume it's 0-width and just append it to the output
      # buffer.
      #
      # This is useful behavior for putting marker nodes into the list so that
      # you can know how things are getting mapped before they get printed.
      buffer << doc
    end

    if commands.empty? && line_suffixes.any?
      commands += line_suffixes.sort_by(&line_suffix_sort)
      line_suffixes = []
    end
  end

  # Reset the group stack and target array so that this pretty printer object
  # can continue to be used before calling flush again if desired.
  reset
end

#group(indent = 0, open_object = "", close_object = "", open_width = open_object.length, close_width = close_object.length) ⇒ Object

Groups the line break hints added in the block. Either all of those hints become line breaks or none of them do.

If indent is specified, the block is treated as if it were wrapped in nest(indent) { … }.

If open_object is specified, text(open_object, open_width) is called before grouping. If close_object is specified, text(close_object, close_width) is called after grouping.
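As a sketch of how the indent and open/close arguments combine, the helper below prints a bracketed list. It uses Ruby's standard-library prettyprint as a stand-in (an assumption; its group method takes the same positional arguments), and render_list is a hypothetical name.

```ruby
require "prettyprint" # stdlib stand-in; assumed to match this class for these calls

# Hypothetical helper: prints items as "[a, b, c]" or one item per line.
def render_list(items, maxwidth)
  PrettyPrint.format("".dup, maxwidth) do |q|
    q.group(2, "[", "]") do # indent by 2, wrap in brackets
      items.each_with_index do |item, index|
        unless index.zero?
          q.text(",")
          q.breakable
        end
        q.text(item)
      end
    end
  end
end

puts render_list(%w[a b c], 80)
puts render_list(%w[alpha beta gamma], 10)
```

In the narrow case every breakable in the group breaks, and each continuation line is indented by the two columns passed as indent.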



# File 'lib/syntax_tree/prettyprint.rb', line 955

def group(
  indent = 0,
  open_object = "",
  close_object = "",
  open_width = open_object.length,
  close_width = close_object.length
)
  text(open_object, open_width) if open_object != ""

  doc = Group.new(groups.last.depth + 1)
  groups << doc
  target << doc

  with_target(doc.contents) do
    if indent != 0
      nest(indent) { yield }
    else
      yield
    end
  end

  groups.pop
  text(close_object, close_width) if close_object != ""

  doc
end

#if_break ⇒ Object

Inserts an IfBreak node with the contents of the block being added to its list of nodes that should be printed if the surrounding node breaks. If it doesn’t, then you can specify the contents to be printed with the #if_flat method used on the return object from this method. For example,

q.if_break { q.text('do') }.if_flat { q.text('{') }

In the example above, if the surrounding group is broken it will print ‘do’ and if it is not it will print ‘{’.



# File 'lib/syntax_tree/prettyprint.rb', line 1006

def if_break
  doc = IfBreak.new
  target << doc

  with_target(doc.break_contents) { yield }
  IfBreakBuilder.new(self, doc)
end

#indent ⇒ Object

Very similar to the #nest method, this indents the nested content by one level by inserting an Indent node into the print tree. The contents of the node are determined by the block.



# File 'lib/syntax_tree/prettyprint.rb', line 1017

def indent
  doc = Indent.new
  target << doc

  with_target(doc.contents) { yield }
  doc
end

#line_suffix(priority: LineSuffix::DEFAULT_PRIORITY) ⇒ Object

Inserts a LineSuffix node into the print tree. The contents of the node are determined by the block.



# File 'lib/syntax_tree/prettyprint.rb', line 1027

def line_suffix(priority: LineSuffix::DEFAULT_PRIORITY)
  doc = LineSuffix.new(priority: priority)
  target << doc

  with_target(doc.contents) { yield }
  doc
end

#nest(indent) ⇒ Object

Increases the left margin by indent columns for any line breaks added in the block.
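A minimal sketch of the effect, using Ruby's standard-library prettyprint as a stand-in (an assumption; it provides nest with the same signature). Only the breakable inside the nest block picks up the extra margin. The helper name render_block is hypothetical.

```ruby
require "prettyprint" # stdlib stand-in; assumed to match this class for these calls

# Hypothetical helper: a body indented between "begin" and "end".
def render_block(maxwidth)
  PrettyPrint.format("".dup, maxwidth) do |q|
    q.group do
      q.text("begin")
      q.nest(4) do
        q.breakable # if broken, the next line starts at column 4
        q.text("do_work")
      end
      q.breakable   # outside the nest, so back at column 0
      q.text("end")
    end
  end
end

puts render_block(12)
puts render_block(80)
```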



# File 'lib/syntax_tree/prettyprint.rb', line 1037

def nest(indent)
  doc = Align.new(indent: indent)
  target << doc

  with_target(doc.contents) { yield }
  doc
end

#text(object = "", width = object.length) ⇒ Object

This adds object to the print tree as text that occupies width columns.

If width is not specified, object.length is used.
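The explicit width matters when a string's length does not match the columns it occupies on screen. The sketch below treats each of three CJK characters as two columns wide, which is an assumption about the display, and uses Ruby's standard-library prettyprint as a stand-in (also an assumption). The helper name render_label is hypothetical.

```ruby
require "prettyprint" # stdlib stand-in; assumed to match this class for these calls

# Hypothetical helper: width is nil to use the default, or an explicit column count.
def render_label(width)
  PrettyPrint.format("".dup, 8) do |q|
    q.group do
      if width
        q.text("日本語", width) # caller-supplied column count
      else
        q.text("日本語")        # falls back to "日本語".length, which is 3
      end
      q.breakable
      q.text("ok")
    end
  end
end

p render_label(nil) # counted as 3 columns, so the line appears to fit
p render_label(6)   # counted as 6 columns, so the group breaks
```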



# File 'lib/syntax_tree/prettyprint.rb', line 1048

def text(object = "", width = object.length)
  doc = target.last

  unless Text === doc
    doc = Text.new
    target << doc
  end

  doc.add(object: object, width: width)
  doc
end

#trim ⇒ Object

This inserts a Trim node into the print tree which, when printed, will clear all whitespace at the end of the output buffer. This is useful for the rare case where you need to delete printed indentation and force the next node to start at the beginning of the line.



# File 'lib/syntax_tree/prettyprint.rb', line 935

def trim
  doc = Trim.new
  target << doc

  doc
end

#with_target(target) ⇒ Object

A convenience method used by a lot of the print tree node builders that temporarily changes the target that the builders will append to.
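The swap-and-restore pattern can be sketched on its own with a miniature builder. TinyBuilder below is hypothetical and is not the class documented here; it only keeps a target array and nests arrays for groups. One deliberate difference from the source shown on this page is the ensure clause, added in this sketch so the previous target is restored even if the block raises.

```ruby
# Hypothetical miniature builder, for illustration only.
class TinyBuilder
  attr_reader :target

  def initialize
    @root = []
    @target = @root
  end

  # Appends to whatever array is currently the target.
  def text(string)
    target << string
  end

  # Everything emitted inside the block lands in a nested array.
  def group
    contents = []
    target << contents
    with_target(contents) { yield }
    contents
  end

  def with_target(target)
    previous_target, @target = @target, target
    yield
  ensure
    @target = previous_target # addition in this sketch: restore on errors too
  end

  def tree
    @root
  end
end

# Hypothetical helper that builds a small tree.
def build_sample_tree
  builder = TinyBuilder.new
  builder.text("a")
  builder.group { builder.text("b") }
  builder.text("c")
  builder.tree
end

p build_sample_tree
```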



# File 'lib/syntax_tree/prettyprint.rb', line 1066

def with_target(target)
  previous_target, @target = @target, target
  yield
  @target = previous_target
end