prettypr
A generic pretty printer library.
This module uses a strict-style context passing implementation of John Hughes's algorithm, described in "The design of a Pretty-printing Library". The paragraph-style formatting, empty documents, floating documents, and null strings are my own additions to the algorithm.
To get started, you should read about the document() data type; the main constructor functions: text/1, above/2, beside/2, nest/2, sep/1, and par/2; and the main layout function format/3.
If you simply want to format a paragraph of plain text, you probably want to use the text_par/2 function, as in the following example:
prettypr:format(prettypr:text_par("Lorem ipsum dolor sit amet"), 20)
Types
document() = null
           | #text{s = undefined | deep_string()}
           | #nest{n = undefined | integer(),
                   d = undefined | document()}
           | #beside{d1 = undefined | document(),
                     d2 = undefined | document()}
           | #above{d1 = undefined | document(),
                    d2 = undefined | document()}
           | #sep{ds = undefined | [document()],
                  i = integer(),
                  p = boolean()}
           | #float{d = undefined | document(),
                    h = undefined | integer(),
                    v = undefined | integer()}
           | #union{d1 = undefined | document(),
                    d2 = undefined | document()}
           | #fit{d = undefined | document()}
Functions
text(Characters::string()) -> document()
Yields a document representing a fixed, unbreakable sequence of characters. The string should contain only printable characters (tabs allowed but not recommended), and not newline, line feed, vertical tab, etc. A tab character (\t) is interpreted as padding of 1-8 space characters to the next column of 8 characters within the string.
See also: empty/0, null_text/1, text_par/2.
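A minimal sketch of text/1 in use, assuming the prettypr module (shipped with OTP's syntax_tools application) is on the code path. The module name prettypr_text_demo is hypothetical and exists only for illustration.

```erlang
-module(prettypr_text_demo).
-export([demo/0]).

%% Builds a single unbreakable piece of text and lays it out
%% with the default paper and line widths.
demo() ->
    D = prettypr:text("hello, world"),
    prettypr:format(D).   % "hello, world"
```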
null_text(Characters::string()) -> document()
Similar to text/1, but the result is treated as having zero width. This is regardless of the actual length of the string. Null text is typically used for markup, which is supposed to have no effect on the actual layout.
The standard example is when formatting source code as HTML to be placed within <pre>...</pre> markup, and using e.g. <i> and <b> to make parts of the source code stand out. In this case, the markup does not add to the width of the text when viewed in an HTML browser, so the layout engine should simply pretend that the markup has zero width.
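The markup case can be sketched as follows. The module name prettypr_null_text_demo and the helper bold/1 are hypothetical. The paper width of 6 is chosen so that the five visible characters fit on one line while the full twelve-character string, markup included, would not.

```erlang
-module(prettypr_null_text_demo).
-export([demo/0]).

%% Hypothetical helper: wraps a document in zero-width <b> markup.
bold(D) ->
    prettypr:beside(prettypr:null_text("<b>"),
                    prettypr:beside(D, prettypr:null_text("</b>"))).

%% Only "ab cd" (5 columns) counts toward the paper width of 6,
%% so the horizontal layout of the sep can still be selected.
demo() ->
    D = prettypr:sep([bold(prettypr:text("ab")), prettypr:text("cd")]),
    prettypr:format(D, 6).   % "<b>ab</b> cd"
```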
text_par(Text::string()) -> document()
Equivalent to text_par(Text, 0).
text_par(Text::string(), Indentation::integer()) -> document()
Yields a document representing paragraph-formatted plain text. The optional Indentation parameter specifies the extra indentation of the first line of the paragraph. For example, text_par("Lorem ipsum dolor sit amet", N) could represent
    Lorem ipsum dolor
    sit amet
if N = 0, or
      Lorem ipsum dolor
    sit amet
if N = 2, or
    Lorem ipsum dolor
      sit amet
if N = -2.
(The sign of the indentation is thus reversed compared to the par/2 function, and the behaviour varies slightly depending on the sign in order to match the expected layout of a paragraph of text.)
Note that this is just a utility function, which does all the work of splitting the given string into words separated by whitespace and setting up a par with the proper indentation, containing a list of text elements.
See also: par/2, text/1, text_par/1.
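A small sketch of text_par/1 and text_par/2, assuming a paper width of 20 columns. The module name prettypr_text_par_demo is hypothetical. The second result is left without an expected-output comment because the exact layout depends on the widths chosen.

```erlang
-module(prettypr_text_par_demo).
-export([demo/0]).

demo() ->
    S = "Lorem ipsum dolor sit amet",
    %% No extra indentation: "Lorem ipsum dolor" (17 columns) fits in 20,
    %% but appending " sit" would need 21, so the line breaks there.
    Plain = prettypr:format(prettypr:text_par(S), 20),
    %% Negative indentation: lines after the first are indented instead.
    Hanging = prettypr:format(prettypr:text_par(S, -2), 20),
    {Plain, Hanging}.
```

Here Plain should come out as "Lorem ipsum dolor\nsit amet".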
empty() -> document()
Yields the empty document, which has neither height nor width. (empty is thus different from an empty text string, which has zero width but height 1.)
Empty documents are occasionally useful; in particular, they have the property that above(X, empty()) will force a new line after X without leaving an empty line below it; since this is a common idiom, the utility function break/1 will place a given document in such a context.
See also: text/1.
break(D::document()) -> document()
Forces a line break at the end of the given document. This is a utility function; see empty/0 for details.
nest(N::integer(), D::document()) -> document()
Indents a document a number of character positions to the right. Note that N may be negative, shifting the text to the left, or zero, in which case D is returned unchanged.
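As an illustration, nest/2 is typically combined with above/2 to indent a block between two unindented lines. This is a sketch only; the module name prettypr_nest_demo is hypothetical.

```erlang
-module(prettypr_nest_demo).
-export([demo/0]).

%% Indents the middle line by two columns relative to its neighbours.
demo() ->
    Body = prettypr:nest(2, prettypr:text("body")),
    D = prettypr:above(prettypr:text("begin"),
                       prettypr:above(Body, prettypr:text("end"))),
    prettypr:format(D).   % "begin\n  body\nend"
```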
beside(D1::document(), D2::document()) -> document()
Concatenates documents horizontally. Returns a document representing the concatenation of the documents D1 and D2 such that the last character of D1 is horizontally adjacent to the first character of D2, in all possible layouts. (Note: any indentation of D2 is lost.)
Examples:
    ab  cd  =>  abcd

    ab  ef      ab
    cd  gh  =>  cdef
                  gh
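The two beside/2 examples can be reproduced with a short sketch. The module name prettypr_beside_demo is hypothetical, and the two-line operands are built with above/2.

```erlang
-module(prettypr_beside_demo).
-export([demo/0]).

demo() ->
    T = fun prettypr:text/1,
    %% Two-line documents: "ab" over "cd", and "ef" over "gh".
    D1 = prettypr:above(T("ab"), T("cd")),
    D2 = prettypr:above(T("ef"), T("gh")),
    {prettypr:format(prettypr:beside(T("ab"), T("cd"))),   % "abcd"
     prettypr:format(prettypr:beside(D1, D2))}.            % "ab\ncdef\n  gh"
```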
above(D1::document(), D2::document()) -> document()
Concatenates documents vertically. Returns a document representing the concatenation of the documents D1 and D2 such that the first line of D2 follows directly below the last line of D1, and the first character of D2 is in the same horizontal column as the first character of D1, in all possible layouts.
Examples:
    ab  cd  =>  ab
                cd

                   abc
    abc   fgh  =>   de
     de    ij      fgh
                    ij
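A sketch of above/2 with operands whose second line is indented by one column, built here with nest/2. The module name prettypr_above_demo is hypothetical.

```erlang
-module(prettypr_above_demo).
-export([demo/0]).

demo() ->
    T = fun prettypr:text/1,
    %% Each operand is a two-line document whose second line is
    %% indented by one column.
    D1 = prettypr:above(T("abc"), prettypr:nest(1, T("de"))),
    D2 = prettypr:above(T("fgh"), prettypr:nest(1, T("ij"))),
    {prettypr:format(prettypr:above(T("ab"), T("cd"))),   % "ab\ncd"
     prettypr:format(prettypr:above(D1, D2))}.            % "abc\n de\nfgh\n ij"
```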
sep(Docs::[document()]) -> document()
Arranges documents horizontally or vertically, separated by whitespace. Returns a document representing two alternative layouts of the (nonempty) sequence Docs of documents, such that either all elements in Docs are concatenated horizontally, and separated by a space character, or all elements are concatenated vertically (without extra separation).
Note: If some document in Docs contains a line break, the vertical layout will always be selected.
Examples:
                                 ab
    ab  cd  ef  =>  ab cd ef  |  cd
                                 ef

    ab           ab
    cd  ef  =>   cd
                 ef
See also: par/2.
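A short sketch showing how the paper width decides between the two sep/1 layouts. The module name prettypr_sep_demo is hypothetical.

```erlang
-module(prettypr_sep_demo).
-export([demo/0]).

demo() ->
    D = prettypr:sep([prettypr:text("ab"),
                      prettypr:text("cd"),
                      prettypr:text("ef")]),
    %% "ab cd ef" needs 8 columns: it fits in 80 but not in 5,
    %% so the narrow call falls back to the vertical layout.
    {prettypr:format(D, 80),   % "ab cd ef"
     prettypr:format(D, 5)}.   % "ab\ncd\nef"
```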
par(Docs::[document()]) -> document()
Equivalent to par(Docs, 0).
par(Docs::[document()], Offset::integer()) -> document()
Arranges documents in a paragraph-like layout. Returns a document representing all possible left-aligned paragraph-like layouts of the (nonempty) sequence Docs of documents. Elements in Docs are separated horizontally by a single space character and vertically with a single line break. All lines following the first (if any) are indented to the same left column, whose indentation is specified by the optional Offset parameter relative to the position of the first element in Docs. For example, with an offset of -4, the following layout can be produced, for a list of documents representing the numbers 0 to 15:
        0 1 2 3
    4 5 6 7 8 9
    10 11 12 13
    14 15
or with an offset of +2:
    0 1 2 3 4 5 6
      7 8 9 10 11
      12 13 14 15
The utility function text_par/2 can be used to easily transform a string of text into a par representation by splitting it into words.
Note that whenever a document in Docs contains a line break, it will be placed on a separate line. Thus, neither a layout such as
    ab cd
       ef
nor
    ab
    cd ef
will be generated. However, a useful idiom for making the former variant possible (when wanted) is beside(par([D1, text("")], N), D2) for two documents D1 and D2. This will break the line between D1 and D2 if D1 contains a line break (or if otherwise necessary), and optionally further indent D2 by N character positions. The utility function follow/3 creates this context for two documents D1 and D2, and an optional integer N.
See also: par/1, text_par/2.
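The numbers example can be sketched as below. The module name prettypr_par_demo and the function numbers/2 are hypothetical. Where the lines break depends on the chosen paper width, so no exact output is shown.

```erlang
-module(prettypr_par_demo).
-export([numbers/2]).

%% Lays out the numbers 0..15 as a paragraph with the given offset
%% for all lines after the first, within the given paper width.
numbers(Offset, PaperWidth) ->
    Docs = [prettypr:text(integer_to_list(N)) || N <- lists:seq(0, 15)],
    prettypr:format(prettypr:par(Docs, Offset), PaperWidth).
```

For instance, io:put_chars(prettypr_par_demo:numbers(2, 13)) prints the paragraph with continuation lines indented by two columns.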
follow(D1::document(), D2::document()) -> document()
Equivalent to follow(D1, D2, 0).
follow(D1::document(), D2::document(), Offset::integer()) -> document()
Separates two documents by either a single space, or a line break and indentation. In other words, one of the layouts
    abc def
or
    abc
     def
will be generated, using the optional offset in the latter case. This is often useful for typesetting programming language constructs.
This is a utility function; see par/2 for further details.
See also: follow/2.
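A brief sketch of follow/3 under two paper widths. The module name prettypr_follow_demo is hypothetical. With the narrow width, the single-line layout needs 7 columns and does not fit in 5, so the broken layout with the offset should be chosen.

```erlang
-module(prettypr_follow_demo).
-export([demo/0]).

demo() ->
    D = prettypr:follow(prettypr:text("abc"), prettypr:text("def"), 1),
    Wide = prettypr:format(D, 80),    % "abc def"
    %% "abc def" does not fit in 5 columns, so a line break with
    %% an offset of 1 is expected here.
    Narrow = prettypr:format(D, 5),
    {Wide, Narrow}.
```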
floating(D::document()) -> document()
Equivalent to floating(D, 0, 0).
floating(D::document(), Hp::integer(), Vp::integer()) -> document()
Creates a "floating" document. The result represents the same set of layouts as D; however, a floating document may be moved relative to other floating documents immediately beside or above it, according to their relative horizontal and vertical priorities. These priorities are set with the Hp and Vp parameters; if omitted, both default to zero.
Notes: Floating documents appear to work well, but are currently less general than you might wish, losing effect when embedded in certain contexts. It is possible to nest floating-operators (even with different priorities), but the effects may be difficult to predict. In any case, note that the way the algorithm reorders floating documents amounts to a "bubblesort", so don't expect it to be able to sort large sequences of floating documents quickly.
format(D::document()) -> string()
Equivalent to format(D, 80).
format(D::document(), PaperWidth::integer()) -> string()
Equivalent to format(D, PaperWidth, 65).
format(D::document(), PaperWidth::integer(), LineWidth::integer()) -> string()
Computes a layout for a document and returns the corresponding text. See document() for further information. Throws no_layout if no layout could be selected.
PaperWidth specifies the total width (in character positions) of the field for which the text is to be laid out. LineWidth specifies the desired maximum width (in number of characters) of the text printed on any single line, disregarding leading and trailing white space. These parameters need to be properly balanced in order to produce good layouts. By default, PaperWidth is 80 and LineWidth is 65.
See also: best/3.
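A sketch comparing the default widths with explicit ones. The module name prettypr_format_demo is hypothetical, and the narrow widths of 20 are an arbitrary choice for illustration.

```erlang
-module(prettypr_format_demo).
-export([demo/0]).

demo() ->
    D = prettypr:text_par("Lorem ipsum dolor sit amet"),
    Default = prettypr:format(D),            % PaperWidth 80, LineWidth 65
    Explicit = prettypr:format(D, 80, 65),   % the same widths spelled out
    Narrow = prettypr:format(D, 20, 20),     % forces the paragraph to wrap
    {Default =:= Explicit, Default, Narrow}.
```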
best(D::document(), PaperWidth::integer(), LineWidth::integer()) -> empty | document()
Selects a "best" layout for a document, creating a corresponding fixed-layout document. If no layout could be produced, the atom empty is returned instead. For details about PaperWidth and LineWidth, see format/3. The function is idempotent.
One possible use of this function is to compute a fixed layout for a document, which can then be included as part of a larger document. For example:
above(text("Example:"), nest(8, best(D, W - 12, L - 6)))
will format D as a displayed-text example indented by 8, whose right margin is indented by 4 relative to the paper width W of the surrounding document, and whose maximum individual line length is shorter by 6 than the line length L of the surrounding document.
This function is used by the format/3 function to prepare a document before being laid out as text.
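The embedding idiom can be wrapped in a small helper that also handles the empty return value. This is a sketch under assumptions: the module name prettypr_best_demo, the function example/3, and the fallback text are hypothetical, and an indentation of 4 (with 4 more columns of right margin, hence W - 8) is used instead of 8 to keep the output narrow.

```erlang
-module(prettypr_best_demo).
-export([example/3]).

%% Fixes the layout of D for a narrower field, then embeds it as an
%% indented block under a heading. W and L are the paper width and
%% line width of the surrounding document.
example(D, W, L) ->
    case prettypr:best(D, W - 8, L - 6) of
        empty ->
            %% best/3 returns the atom empty when no layout is found.
            prettypr:text("Example: (no layout)");
        Fixed ->
            prettypr:above(prettypr:text("Example:"),
                           prettypr:nest(4, Fixed))
    end.
```

A possible call is prettypr:format(prettypr_best_demo:example(prettypr:text_par("Lorem ipsum dolor sit amet"), 28, 26)), which lays out the paragraph in a 20-column field before indenting it.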