foldl in terms of foldr: the magic of composition

(last updated: )

In Haskell’s \(\mathrm{Foldable}\) class, we are only required to give a definition of \(\mathrm{foldMap}\) or \(\mathrm{foldr}\): each of them can be expressed in terms of each other, and more surprisingly, for left-to-right data structures1, \(\mathrm{foldl}\) (and \(\mathrm{foldl'}\) and friends) can also be expressed in terms of \(\mathrm{foldr}\) or \(\mathrm{foldMap}\): a right fold somehow ‘magically’ becomes a left fold.2

There are already a few resources explaining how this ‘magic’ happens, but it still seems puzzling to most (at least from what I infer), and I want to give my own alternative (though slightly similar) explanation, that helped me finally partake in this sorcery also.

The problem

Suppose we have some binary operation \((\ast)\)3, along with a 3-element list \([x, y, z]\) of type \([A]\). Suppose we have a provided ‘neutral’ element of type \(B\), that we’ll call \(0\) (though it might not necessarily be a number, nor the ‘identity’ element).

If we use \(\mathrm{foldr}\) with these parameters, the expansion becomes \[ \begin{align*} \mathrm{foldr} \; (\ast) \; 0 \; [x, y, z] &= x \ast \left( \mathrm{foldr} \; (\ast) \; 0 \; [y, z] \right) \\ &= x \ast \left( y \ast \left( \mathrm{foldr} \; (\ast) \; 0 \; [z] \right) \right) \\ &= x \ast \left( y \ast \left( z \ast \left( \mathrm{foldr} \; (\ast) \; 0 \; [\,] \right) \right) \right) \\ &= x \ast \left( y \ast \left( z \ast 0 \right) \right) \end{align*} \]

and by the type of \(\mathrm{foldr}\), our binary operation must have type \(A \to B \to B\).

Conversely, if we use \(\mathrm{foldl}\), we deduce that

\[ \begin{align*} \mathrm{foldl} \; (\ast) \; 0 \; [x, y, z] &= \mathrm{foldl} \; (\ast) \; \left( 0 \ast x \right) \; [y, z] \\ &= \mathrm{foldl} \; (\ast) \; \left( \left(0 \ast x\right) \ast y \right) \; [z] \\ &= \mathrm{foldl} \; (\ast) \; \left( \left(\left(0 \ast x\right) \ast y\right) \ast z \right) \; [\,] \\ &= \left(\left(0 \ast x\right) \ast y\right) \ast z \end{align*} \]

and by the type of \(\mathrm{foldl}\), our binary operation must have type \(B \to A \to B\).

We do not require \((\ast)\) to be associative: as we see above, the type of its first and second argument don’t even need to be the same, and which one is \(A\) and which one is \(B\) changes depending on whether we plan to use \(\mathrm{foldr}\) or \(\mathrm{foldl}\)!

Our problem is this: given that \((\ast)\) is not associative (and doesn’t even need to have a symmetric type), can we use \(\mathrm{foldr}\), in some way, to get the expansion of \(\mathrm{foldl}\), using the binary operation \((\ast)\)?

The magic of composition

Consider the following, seemingly unrelated expression \[ (x \; \ast) \circ (y \; \ast) \circ (z \; \ast) \] (a composition of left sections). What does this expression represent? If we mechanically eta-expand it out, we see that \[ \begin{align*} (x \; \ast) \circ (y \; \ast) \circ (z \; \ast) &= \lambda w \to \; \left((x \; \ast) \circ \left((y \; \ast) \circ (z \; \ast)\right)\right) \; w \\ &= \lambda w \to \; \left(x \; \ast\right) \; \left(\left((y \; \ast) \circ (z \; \ast)\right) \; w\right) \\ &= \lambda w \to \; x \ast \left(\left((y \; \ast) \circ (z \; \ast)\right) \; w\right) \\ &= \lambda w \to \; x \ast \left((y \; \ast) \left( (z \; \ast) \; w \right)\right) \\ &= \lambda w \to \; x \ast \left(y \ast \left( (z \; \ast) \; w \right)\right) \\ &= \lambda w \to \; x \ast \left(y \ast \left(z \ast w \right)\right) \text{.} \\ \end{align*} \]

Hence, if we apply this expression to \(0\), we get \[ (x \; \ast) \circ (y \; \ast) \circ (z \; \ast) \; \$ \; 0 = x \ast \left(y \ast \left(z \ast 0\right)\right) \text{.} \] This is the same as our expansion of \(\mathrm{foldr}\) earlier. Note also that reduction begins with the leftmost function; this means that the laziness of the original \(\mathrm{foldr}\) expression is retained: if \((\ast)\) is lazy in its second argument, then the reduction can terminate early, without consuming excessive space.

Conversely, consider this expression \[ (\ast \; z) \circ (\ast \; y) \circ (\ast \; x) \text{.} \]

(which is a composition of right sections, and the elements of the list appear in reverse order). By eta-expanding, we deduce \[ \begin{align*} (\ast \; z) \circ (\ast \; y) \circ (\ast \; x) &= \lambda w \to \; \left( (\ast \; z) \circ \left((\ast \; y) \circ (\ast \; x)\right) \right) \; w \\ %f &= \lambda w \to \; \left( \ast \; z \right) \left( \left((\ast \; y) \circ (\ast \; x)\right) \; w \right)\\ &= \lambda w \to \; \left( \left((\ast \; y) \circ (\ast \; x)\right) \; w \right) \ast z\\ &= \lambda w \to \; \left( \left(\ast \; y\right) \left( \left(\ast \; x\right) \; w \right) \right) \ast z \\ &= \lambda w \to \; \left( \left( \left(\ast \; x\right) \; w \right) \ast y \right) \ast z \\ &= \lambda w \to \; \left( \left(w \ast x \right) \ast y \right) \ast z \text{.} \end{align*} \]

Hence, if we apply this expression to \(0\), we get \[ (\ast \; z) \circ (\ast \; y) \circ (\ast \; x) \; \$ \; 0 = \; \left( \left(0 \ast x\right) \ast y \right) \ast z \] which is the same as our \(\mathrm{foldl}\) expansion earlier.4

So instead of using the binary operation \((\ast)\) directly, we can write the expansions using composition instead, along with an application at the very end. And we know that composition is associative: if we have a series of compositions, it won’t matter if they’re right-associated or left-associated.

This suggests a \(\mathrm{foldr}\) solution using composition: we want to obtain the expression \((\ast \; z) \circ (\ast \; y) \circ (\ast \; x)\). If we look at the expansion of foldr from earlier, it might not seem immediately obvious how to get the desired order, since the first element of the list always appears leftmost; but that is only an appearance:

\[ \begin{align*} \mathrm{foldr} \; \left(\mathrm{flip} \; (\diamond)\right) \; 0 \; [x, y, z] &= \left( \mathrm{foldr} \; \left(\mathrm{flip} \; (\diamond) \right) \; 0 \; [y, z] \right) \diamond x \\ &= \left( \left( \mathrm{foldr} \; \left(\mathrm{flip} \; (\diamond) \right) \; 0 \; [z] \right) \diamond y \right) \diamond x \\ &= \left( \left( \left( \mathrm{foldr} \; \left(\mathrm{flip} \; (\diamond) \right) \; 0 \; [\,] \right) \diamond z \right) \diamond y \right) \diamond x \\ &= \left( \left( 0 \diamond z \right) \diamond y \right) \diamond x \end{align*} \]

By flipping the operator (for some \((\diamond)\)), the first element of the list now appears rightmost. We haven’t magically turned this into a left fold (yet): the order \(0, z, y, x\) is still different to the \(0, x, y, z\) from \(\mathrm{foldl}\). But ‘reversing’ the order \(x, y, z, 0\) of \(\mathrm{foldr}\) to \(0, z, y, x\) is very easy: it doesn’t matter if our operator isn’t commutative, we can just \(\mathrm{flip}\) it.

But instead of flipping the operator with \(\mathrm{flip}\), we can write a lambda abstraction to do the same thing:

\[ \mathrm{foldr} \; \left(\lambda \, a \; b \to b \diamond a \right) \; 0 \; [x, y, z] = \left( \left( 0 \diamond z \right) \diamond y \right) \diamond x \text{.} \]

Here, \(a\) represents ‘each element of the list’, while \(b\) represents ‘the rest of the fold’. And of course, we don’t want to use \((\diamond)\); we want to use composition, combined with \((\ast)\):

\[ \begin{align*} \mathrm{foldr} \; \left(\lambda \, a \; b \to b \circ \left( \ast \; a \right) \right) \; (???) \; [x, y, z] &= \left( \left( (???) \circ \left(\ast z\right) \right) \circ \left(\ast y\right) \right) \circ \left(\ast x\right) \\ &= (???) \circ \left(\ast z\right) \circ \left(\ast y\right) \circ \left(\ast x\right) \end{align*} \]

and we can get rid of superfluous parentheses, since we know composition is associative. But what should we use for \((???)\)? That’s easy: the identity element of composition is the identity function, \(\mathrm{id}\):

\[ \begin{align*} \mathrm{foldr} \; \left(\lambda \, a \; b \to b \circ \left( \ast \; a \right) \right) \; \mathrm{id} \; [x, y, z] &= \mathrm{id} \circ \left(\ast z\right) \circ \left(\ast y\right) \circ \left(\ast x\right) \\ &= \left(\ast z\right) \circ \left(\ast y\right) \circ \left(\ast x\right) \text{.} \end{align*} \]

This is a composition, so we now have a function: all we need to do is apply an argument to get the result. We saw earlier that the correct argument to apply is \(0\):

\[ \begin{align*} \mathrm{foldr} \; \left(\lambda \, a \; b \to b \circ \left( \ast \; a \right) \right) \; \mathrm{id} \; [x, y, z] \; 0 &= \left(\ast z\right) \circ \left(\ast y\right) \circ \left(\ast x\right) \; \$ \; 0 \\ &= \; \left( \left(0 \ast x\right) \ast y \right) \ast z \text{.} \end{align*} \] And now, it’s the same as foldl. Tada!


Often, in other sources, the folding function is eta-expanded, and written pointfully, as in \[ \mathrm{foldr} \; \left(\lambda \, a \; b \; w \to b \left(w \ast a \right) \right) \; \mathrm{id} \; [x, y, z] \; 0 \text{,} \] or if \((\ast)\) is written as a function \(f\) instead of an operator, it is written \[ \mathrm{foldr} \; \left(\lambda \, a \; b \; w \to b \left(f \; w \; a \right) \right) \; \mathrm{id} \; [x, y, z] \; 0 \text{.} \] But I believe this is a case where the solution is easier to understand when written (partially) pointfree, rather than pointful: you can view composition as the folding operator being used, rather than seeing an ‘extra argument’ \(w\), seemingly coming out of nowhere and having a non-obvious purpose (by expanding this out5, we learn that it plays the role of the ‘new accumulator’, becoming \(0\), and then \(f \; 0 \; x\), and then \(f \; \left(f \; 0 \; x \right) \; y\), and then \(f \; \left(f \; \left(f \; 0 \; x\right) \; y\right)\; z\), but we never really needed to worry about this directly if we used composition: we just know it from the expansion). The function application also ends up obscuring the roles of \(a\) and \(b\) in the fold (\(b\) is to be composed with \(\left(\ast \; a\right)\)), and the associativity of the operation. I find it far easier to understand this fold as a series of compositions, rather than a sequence of applications.

But there is one advantage of the pointful definition: if we want to try apply ‘the rest of the fold’ \(b\) strictly to \(f \; w \; a\), as in \(\mathrm{foldl'}\), then an ‘obvious’ solution comes to mind, when stated that way: \[ \mathrm{foldr} \; \left(\lambda \, a \; b \; w \to b \; $! \; f \; w \; a \right) \; \mathrm{id} \; [x, y, z] \; 0 \text{.} \] However, in the (current) implementation of \(\mathrm{foldl'}\) in base, we see the definition \[ \mathrm{foldr} \; \left(\lambda \, a \; b \; !w \to b \; \left(f \; w \; a\right) \right) \; \mathrm{id} \; [x, y, z] \; 0 \text{.} \] The difference is very subtle6: in the former definition, we might not force \(0\) directly (we compute \(f \; 0 \; x\) at the start, which might force \(0\) if \(f\) is strict, and if the list was empty, we return \(0\)), but in the latter definition, \(0\) is always forced, even if the list is non-empty.

If we had a strict composition operator in the Prelude, then we could write the former definition more cleanly as \[ \mathrm{foldr} \; \left(\lambda \, a \; b \to b \; \circ! \; \left(\ast \; a\right) \right) \; \mathrm{id} \; [x, y, z] \; 0 \] where \[ \left(g \; \circ! \; f\right) = \lambda \, x \to g \; $! \; f \; x \text{.} \]

For the rest of this blogpost, we’ll use this definition of \(\mathrm{foldl'}\), even though it’s very slightly different from the standard one, since the strict composition operator \((\circ!)\) is more convenient to work with. If you insist, you may even add a \(\mathrm{seq} \; 0 \; \dotsc\) to obtain the standard behaviour.

The power of \(\mathrm{Endo}\)

If we’re using \((\ast)\) in a left fold, then the type of \((\ast)\) is \(B \to A \to B\). The right section \((\ast \; x)\) (which is equivalent to \(\mathrm{flip} \; (\ast) \; x\)) therefore has the type \(B \to B\): it must have this symmetric type, since we’re composing multiple right sections together, and the codomain of one function must match the domain of the other to be composed.

A function whose domain is the same as its codomain is called an endofunction, and more generally, a morphism whose source is the same as its target is called an endomorphism. (For now, we will only care about when morphisms are functions, but we’ll use the terms endofunction and endomorphism interchangeably.)

Endofunctions can be represented in Haskell by the \(\mathrm{Endo}\) newtype: an \(\mathrm{Endo} \; a\) simply wraps a function of type \(a \to a\). Additionally, \(\mathrm{Endo} \; a\) is a \(\mathrm{Monoid}\): the binary operation is (wrapped) composition, \[ \mathrm{Endo} \; g \diamond \mathrm{Endo} \; f = \mathrm{Endo} \; \left(g \circ f\right)\text{,} \] and the identity element is the (wrapped) identity function, \[ \mathrm{mempty} = \mathrm{Endo} \; \mathrm{id} \text{.} \]

This is why when looking at the (current) actual definition of \(\mathrm{foldl}\) in base, you will see mentions of \(\mathrm{Endo}\) and \(\mathrm{appEndo}\), and \(\mathrm{foldMap}\) is used instead, which combines monoidal results right-associatively.

But we’re not quite done just using \(\mathrm{Endo}\), since \(\mathrm{foldMap}\) will produce the compositions left-to-right, as in \[ (\ast \; x) \circ (\ast \; y) \circ (\ast \; z) \text{,} \] but we want \[ (\ast \; z) \circ (\ast \; y) \circ (\ast \; x) \text{.} \] This is an easy fix, using yet another newtype: \(\mathrm{Dual} \; m\) wraps a monoid \(m\), and simply swaps the order of \((\diamond)\): \[ \mathrm{Dual} \; a \diamond \mathrm{Dual} \; b = \mathrm{Dual} \; \left(b \diamond a\right) \text{.} \] With a bit of newtype wrapping/unwrapping noise, we can arrive at an equivalent definition of \(\mathrm{foldl}\), \[ \left(\$ \; 0\right) \circ \mathrm{appEndo} \circ \mathrm{getDual} \; \$ \; \mathrm{foldMap} \; \left(\lambda \, a \to \mathrm{Dual} \; \left(\mathrm{Endo} \; \left(\ast \; a\right) \right) \right) \; [x, y, z] \text{.} \] The power of \(\mathrm{Endo}\) (and endomorphisms in general) is that they form a monoid: the associativity of composition is what lets us not care that we’re using \(\mathrm{foldr}\) instead of \(\mathrm{foldl}\), since we can represent ‘the way \((\ast)\) is meant to be associated’ by the order of composition of the endomorphisms (i.e. whether we use \(\mathrm{Dual}\) or not), rather than the association of the parentheses.

Other folds: not just function composition

Besides being a trick to avoid having to define \(\mathrm{foldl}\) manually, we can still use \(\mathrm{foldr}\) with endomorphisms to represent other kinds of folds.

A strict, short circuiting left fold

As an example, suppose we want to compute the product of a very large list of (real) numbers. Then if any of the numbers are \(0\) (where \(0\) really means zero this time), we know the result will be \(0\), so we don’t need to bother multiplying the rest of the list.

The standard product function is defined to be a strict left fold (indirectly, using \(\mathrm{foldMap'}\); the behaviour would be exactly the same using \(\mathrm{foldl'}\)), but it’s not short circuiting: the expression \(\mathrm{product} \; \left(\mathrm{repeat} \; 0\right)\) would fail to terminate, even though we know the product of an infinite list of zeroes (or any list containing zero) should be zero.

We still want to use a strict left fold, since multiplication (for the number types defined in base) is strict, and we don’t want to leak space, so we can’t just use \(\mathrm{foldr}\) in the ‘obvious’ way. But if we use our newly found discovery, that \[ \mathrm{foldl'} \; (\ast) \; w \; xs = \mathrm{foldr} \; \left(\lambda \, a \; b \to b \; \circ! \; \left(\ast\; a\right) \right) \; \mathrm{id} \; xs \; w\text{,} \] then we can deduce \[ \begin{align*} \mathrm{product} \; xs &= \mathrm{foldl'} \; (\ast) \; 1 \; xs \\ &= \mathrm{foldr} \; \left(\lambda \, a \; b \to b \; \circ! \; \left(\ast\; a\right) \right) \; \mathrm{id} \; xs \; 1 \text{.} \end{align*} \] (Recall that \(a\) represents each element of the list, and \(b\) represents the rest of the fold: a function, consisting of a composition of functions.) Since we have access to each element of the list, \(a\), we can check if that’s \(0\), and then act accordingly: \[ \mathrm{scProduct} \; xs = \mathrm{foldr} \; \left(\lambda \, a \; b \to \mathbf{if} \; a \mathop{==} 0 \; \mathbf{then} \; \mathrm{const} \; 0 \; \mathbf{else} \; b \; \circ! \; \left(\ast\; a\right) \right) \; \mathrm{id} \; xs \; 1 \text{.} \] In \(\mathrm{scProduct}\), if we find that any of the elements of the list are \(0\), then we ignore the rest of the fold (we discard \(b\)), and just return \(\mathrm{const} \; 0\). Thanks to the laziness of \(\mathrm{foldr}\), we don’t keep recurring on the rest of the list, since we’re ignoring it, and we just return the \(\mathrm{const} \; 0\) function, applied to \(1\), which will result in \(0\), as required.

Monadic folds

The \(\mathrm{foldlM}\) function represents a left-to-right monadic fold: if you’re unfamiliar, the documentation provides a good description of what it does. Comparing the type signatures, it’s more powerful than \(\mathrm{traverse\_}\) or \(\mathrm{for\_}\) in that the traversing function has access to the previous \(b\) created from an earlier action (using the extra power of \(\mathrm{Monad}\)), and we return the accumulated \(m \; b\) instead of discarding it.

We’re interested in how it’s implemented: it only has a \(\mathrm{Foldable}\) constraint on \(t\), and it’s specified to potentially short circuit after processing an initial sequence of elements, so it should be obvious that \(\mathrm{foldl}\) will not be appropriate here. The answer is to use \(\mathrm{foldr}\), and composition, but a different kind of composition.

We have a definition to fill in: \[ \begin{align*} &\mathrm{foldlM} \mathop{::} \left(\mathrm{Foldable} \; t, \mathrm{Monad} \; m\right) \implies \left(b \to a \to m \; b\right) \to b \to t \; a \to m \; b \\ &\mathrm{foldlM} \; (\ast) \; w \; xs = \mathrm{foldr} \; \left( \lambda \, x \; k \to \mathbf{\_} \right) \; \mathbf{\_} \; xs \; w \end{align*} \] (where I’ve chosen to rename the term variables \(a\) and \(b\) to \(x\) and \(k\), to avoid confusion with the types). We know that the type of \(x\) has to be \(a\), but what should the rest of the fold, \(k\), be?

Clearly we’re meant to provide the \(x\) values to \((\ast)\) somehow, so it might be useful to look at the right section \((\ast \; x)\), which has type \(b \to m \; b\). The general pattern of \(A \to m \; B\) (for any \(A\) and \(B\)) should feel familiar: even if you might not know what it’s called, it appears in the second argument to \(\mathop{\texttt{>>=}}\), so it’s fairly important in relation to monads: it’s called a Kleisli arrow. It’s like a function, only ‘lifted’ to monads: it’s a morphism. And just like functions, Kleisli arrows can be composed, using the \(\mathop{\texttt{<=<}}\) operator.

Furthermore, our type of \((\ast \; x)\) is \(b \to m \; b\): the input type of the morphism is \(b\), and the output type is \(b\), hence it’s an endomorphism. Everything sounds familiar to our \(\mathrm{foldl}\) example with endofunctions, so perhaps the pattern of composition is similar to our \(\mathrm{foldl}\) example? In that case, \(k\) should also be an endomorphism: it should have type \(b \to m \; b\). Let’s try it: \[ \mathrm{foldlM} \; (\ast) \; w \; xs = \mathrm{foldr} \; \left( \lambda \, x \; k \to k \; \mathop{\texttt{<=<}} \; (\ast \; x) \right) \; \mathbf{\_} \; xs \; w \text{.} \] It typechecks. We’re almost done: we just need an ‘identity function’, or an identity morphism: the identity element with respect to \(\mathop{\texttt{<=<}}\), and something of type \(A \to m \; A\), for any \(A\). That should sound familiar: it’s called \(\mathrm{pure}\): \[ \mathrm{foldlM} \; (\ast) \; w \; xs = \mathrm{foldr} \; \left( \lambda \, x \; k \to k \; \mathop{\texttt{<=<}} \; (\ast \; x) \right) \; \mathrm{pure} \; xs \; w \text{.} \] And we’re done.7

For further convincing that this definition has the left-to-right, short circuiting behaviour we want, we can choose to write it using \(\mathop{\texttt{>=>}}\) instead, which is just \(\mathop{\texttt{<=<}}\) but flipped: \[ \mathrm{foldlM} \; (\ast) \; w \; xs = \mathrm{foldr} \; \left( \lambda \, x \; k \to (\ast \; x) \; \mathop{\texttt{>=>}} \; k \right) \; \mathrm{pure} \; xs \; w \text{.} \] From those docs, we have a law that states \(\left(bs \; \mathop{\texttt{>=>}} \; cs\right) \; a = bs \; a \; \mathop{\texttt{>>=}} \; cs\), so we can eta-expand and mechanically deduce \[ \begin{align*} \mathrm{foldlM} \; (\ast) \; w \; xs &= \mathrm{foldr} \; \left( \lambda \, x \; k \to (\ast \; x) \; \mathop{\texttt{>=>}} \; k \right) \; \mathrm{pure} \; xs \; w \\ &= \mathrm{foldr} \; \left( \lambda \, x \; k \; z \to \left((\ast \; x) \; \mathop{\texttt{>=>}} \; k\right) \; z \right) \; \mathrm{pure} \; xs \; w \\ &= \mathrm{foldr} \; \left( \lambda \, x \; k \; z \to (\ast \; x) \; z \; \mathop{\texttt{>>=}} \; k \right) \; \mathrm{pure} \; xs \; w \\ &= \mathrm{foldr} \; \left( \lambda \, x \; k \; z \to \left(z \ast x\right) \; \mathop{\texttt{>>=}} \; k \right) \; \mathrm{pure} \; xs \; w \end{align*} \] and the use of \(\mathop{\texttt{>>=}}\), and the fact that the rest of the fold, \(k\), is the second operand to \(\mathop{\texttt{>>=}}\), should tell you that we have the short circuiting behaviour we want.

And this is the definition of \(\mathrm{foldlM}\) in base, only using (the identical) \(\mathrm{return}\) instead of \(\mathrm{pure}\), accompanied with the magical INLINE pragma pixie dust; but that is only a cosmetic sprinkling to us.

Perhaps it all doesn’t seem so magical now.


  1. This doesn’t quite work for right-to-left data structures, and the opposite actually becomes true (\(\mathrm{foldr}\) is expressible in terms of \(\mathrm{foldl}\)). But the \(\mathrm{Foldable}\) class and most of the Haskell ecosystem in general tends to neglect right-to-left data structures (they are fairly rare in practice: it’s far more common for a data structure to be efficiently traversable both left-to-right and right-to-left, rather than just the latter, and in the former case, we can express either fold in terms of the other), so we won’t worry about this point too much.↩︎

  2. The converse is not true (for a left-to-right data structure): the easiest way to see this is that \(\mathrm{foldl}\) always recurs on each element of the list, and can never terminate for infinite lists, no matter what function we provide, but \(\mathrm{foldr}\) can.↩︎

  3. Of course, it doesn’t have to be an operator (but the infix notation is more convenient for this blogpost), and it doesn’t necessarily have to be a ‘binary’ two-argument function either (it still needs to fit the type \(A \to B \to B\) or \(B \to A \to B\), but \(B\) can be a function).

    We also don’t need a specifically 3-element list, or a list at all: it’s just a convenient running example.↩︎

  4. Here, if \((\ast)\) is lazy in its first argument, then the reduction can terminate early. But when using \(\mathrm{foldl}\) with a left-to-right data structure, we have to iterate over the first elements of the list anyway, and we end up building the (potentially very big) inner applications before we start reducing (unless strictness is used), so we would still pay a space cost. If we want to reduce earlier, to potentially avoid that space cost (if \((\ast)\) is strict in its first argument and can reduce the space consumed; if it’s lazy, then we might accidentally change semantics), we would use \(\mathrm{foldl'}\) instead (and here, we would use strict composition instead).↩︎

  5. Verifying this is left as an exercise to the reader in equational reasoning, given the definition of \(\mathrm{foldr}\) (for lists): \[ \begin{align*} &\mathrm{foldr} \; g \; w \; [\,] = w \\ &\mathrm{foldr} \; g \; w \; \left(x : xs\right) = g \; x \; \left(\mathrm{foldr} \; g \; w \; xs \right) \text{.} \end{align*} \]↩︎

  6. Also left as a verification exercise.↩︎

  7. Kleisli endomorphisms form a monoid, where \(\mathop{\texttt{<=<}}\) is the associative operation, and \(\mathrm{pure}\) is the identity element. As an exercise, you can formulate this as a newtype, and combine it with \(\mathrm{Dual}\) to derive an equivalent definition, using \(\mathrm{foldMap}\) instead of \(\mathrm{foldr}\).↩︎