Bracket abstraction algorithms
I first came across combinatory logic when I was a postgraduate student at the University of Leeds where I was exploring Frege's ideas about the structure of ordinary language. Timothy Potts, a member of the Philosophy Department, was developing at that time what he termed Fregean categorial grammar and there seemed to me to be obvious similarities between that and Curry's application of the theory of functionality to the structure of language. (This is briefly discussed on pp. 274–275 of the first volume of Combinatory Logic by Curry and Feys.) Potts and Peter Geach, my then supervisor, were not keen on my fascination with combinatory logic as they both held that Curry, by treating functions as complete or saturated entities, had committed the cardinal sin against Fregean orthodoxy. I thought there was a way to answer this charge and continued to use combinatory logic to analyse the grammatical structure of language.
Several years later I became interested in functional programming languages while studying at the Programming Research Group in Oxford and learnt that they could be implemented by being translated into combinatory logic. At the heart of this translation is a bracket abstraction algorithm. Super-combinator and other methods of implementation were then being developed and the use of combinatory logic fell out of favour. I still believed, however, that the full potential of combinatory logic had not yet been revealed and, when I became a lecturer in the School of Computer Science at the University of Birmingham, I was fortunate to find a PhD student who shared my faith. David Stevens came up with the idea of representing combinators iconically. In standard notation the connection between a combinator and its behaviour has to be given by an arbitrary stipulation; in an iconic representation the behaviour of a combinator is read off from its depiction. David devised a particular kind of iconic representation, but I found one that I believe is better and that is what this webpage is devoted to explaining.
There are several systems of combinatory logic. The one used here is weak combinatory logic. Assume given an infinite sequence of symbols called variables and two constants, K and S, called basic combinators. Lowercase letters, sometimes decorated with subscripts, are used for variables. An atom is a variable or a constant. A term is defined thus:
- Every variable is a term;
- Every constant is a term;
- If P and Q are terms, so is (PQ).
Uppercase letters, sometimes decorated with subscripts, are used for terms. A term of the form (PQ) is an application, but the outermost pair of parentheses is usually omitted. Normally, no space is left between the terms of an application, but sometimes one will be inserted for clarity and readability. Application associates to the left, so PQRST is the same as (((PQ)R)S)T. The symbol ≡ represents syntactic identity: P ≡ Q means that P and Q are exactly the same term. Because combinatory logic contains no variable-binding operators every variable in a term is free: FV(P) represents the set of free variables in P.
A term of the form KPQ or SPQR is a redex. Contracting an instance of a redex in a term S means replacing one occurrence of KPQ by P or one occurrence of SPQR by PR(QR). Let the result be T. Then we say that S contracts to T, written S →1 T, and that T is the contractum. S is said to reduce to T, written S → T, iff T results from S by carrying out a finite (possibly zero) number of contractions. Combinators I, B, B', C, C' and S' can be defined in terms of K and S as follows: I := SKK, B := S(KS)K, B' := BB, C := S(BBS)(KK), C' := B(BC)B and S' := B(B(BS)S)K, where P := Q means that P is being defined as Q. We then have the following reduction properties: IP → P, BPQR → P(QR), B'PQRS → PQ(RS), CPQR → PRQ, C'PQRS → P(QS)R and S'PQRS → P(QS)(RS).
Substituting the term P for every free occurrence of x in X, written [P/x]X, is defined in the following way:
- [P/x]x ≡ P;
- [P/x]Y ≡ Y, if Y is an atom distinct from x;
- [P/x]QR ≡ ([P/x]Q) ([P/x]R).
Uni-variate bracket abstraction is a syntactic operation which removes a variable x from a term X, written [x]X, satisfying the property that ([x]X)P → [P/x]X. If [x]X = Q, then X is the input term and Q the abstract. Multi-variate bracket abstraction, written [x1, x2, ..., xa]X, removes several variables from a term X. There are two types of multi-variate abstraction: in the multi-sweep variety we have [x1, x2, ..., xa]X := [x1]([x2](...([xa]X)...)), whereas in the single-sweep variety the a variables are abstracted simultaneously in a single process. On this webpage, unless explicitly stated otherwise, bracket abstraction shall mean uni-variate abstraction.
One of the simplest abstraction algorithms is shown in Fig. 1; I call this algorithm (A). The clauses of this have to be applied in the order in which they appear in Fig. 1, starting at the top. In the first clause y has to be an atom distinct from the abstraction variable x. In the third clause the abstraction variable x can occur in the terms P and Q, but does not have to. (Curry and Feys discuss this algorithm on pp. 189–190 of the first volume of Combinatory Logic.) Unfortunately, it produces fairly long-winded abstracts as you can see for yourself by trying some examples. Only lowercase and uppercase letters and parentheses can be input and each opening parenthesis must be matched with a closing one. Abstraction is performed on the variable x when you click the button.
A better algorithm than (A) is shown in Fig. 2; it is better in the sense that it produces a shorter abstract when applied to the same input term. I refer to this new algorithm as algorithm (B). This is much discussed by Curry, Hindley and Seldin on pp. 42–67 of the second volume of Combinatory Logic; the labelling of the various clauses is due to them and they refer to this as the (abcdef) algorithm. In this the abstraction variable x cannot occur in the term E, but it has to occur in terms X and Y. Although (B) produces shorter abstracts than (A), it still produces quite long-winded abstracts as you can see for yourself.
It is possible to implement a functional programming language using either algorithm (A) or (B), but the resulting implementation is not very efficient. David Turner made the translation of a functional language into combinators practicable by making use of three additional combinators, namely B', C' and S', and appropriate abstraction clauses relating to them. His algorithm is often presented as shown in Fig. 3, but Turner's actual algorithm was slightly different. The differences are not important here. Most of the time algorithm (C) produces shorter abstracts than (B), but sometimes it does produce a longer abstract. Try it for yourself to see how it works.
Representing combinators iconically
Algorithm (L), to be presented shortly, represents combinators as strings of the iconic letters y and n called, not surprisingly, yn-strings. The letter φ is used for an arbitrary yn-string. size(φ) is the number of occurrences of y and n in φ and φi, for 1 ≤ i ≤ size(φ), is the ith letter in φ. String concatenation is represented by juxtaposition. The reduction properties of combinators represented by yn-strings are shown in Fig. 4.
In order to understand how algorithm (L) works you need to know that every combinatory-logic term P can be uniquely expressed in the form P1P2...Pm, where P1 is an atom and m ≥ 1. The Pi are known as the primal components of P. Algorithm (L) is shown in Fig. 5.
Performing bracket abstraction by hand is tedious and error-prone,
so experiment with algorithm (L) here to appreciate how it works.
Algorithm (L) has many nice properties. If the input term and the abstract are written using the fewest number of parentheses, then they both contain exactly the same number of parentheses. Algorithm (C) often adds many parentheses when producing an abstract. This make processing such abstracts less efficient. In a sense that is not easy to define precisely, algorithm (L) preserves the structure of the input term. It takes into account the natural way of representing a term of combinatory logic as a tree. Because of the properties it possesses, algorithm (L) gives rise to a very efficient single-sweep, multi-variate abstraction algorithm
Algorithm (L) is not the most efficient algorithm employing yn-strings. It can be improved to produce shorter abstracts by introducing clauses (b) and (c). The resulting algorithm, called (L+), is shown in Fig. 6. Clause (b), in particular, has a dramatic effect on reducing the size of abstracts that really occur when the algorithm is used to implement a functional language. In most of my papers, however, I focus on algorithm (L), rather than (L+), because it leads to a very elegant single-sweep, multi-variate algorithm.
Try algorithm (L+) for yourself to see how it works in practice.
There is a sense in which algorithm (C) is a sub-algorithm of (L+). Each of the combinators used in (C), except for I, can be represented using yn-strings: K is n, S is yy, B is ny, B' is nny, C is yn, C' is nyn and S' is nyy. If the yn-strings introduced by (L+) in an abstract occur in the above list, then (C) would produce an abstract that only differs in the names of the combinators that occur in it.
Comparison of bracket abstraction algorithms
To more fully appreciate how (L) and (L+) work is it useful to compare them against (C) and also against each other. Have a go at comparing (C) and (L), comparing (L) and (L+) and comparing (L+) and (C) to gain a deeper understanding of the meaning of the iconic combinators introduced here.
- M. W. Bunder, "Some Improvements to Turner's Algorithm for Bracket Abstraction", Journal of Symbolic Locic, vol. 55 (1990), pp. 656–669.
- Haskell B. Curry and Robert Feys, Combinatory Logic, vol. 1, [Amsterdam, North–Holland, 1958].
- Haskell B. Curry, J. R. Hindley and J. P. Seldin, Combinatory Logic, vol. 2, [Amsterdam, North–Holland, 1972].
- Antoni Diller, "Making Abstraction Behave by Rerepresenting Combinators", Research Report, School of Computer Science, University of Birmingham, CSR–99–12 (November 1999); a PDF version of this paper is available on this website.
- Antoni Diller, "Investigations into Iconic Representations of Combinators", in Javier Blanco (editor), Argentine Workshop on Theoretical Computer Science (WAIT2000) Proceedings: Tandil, September 4-9, 2000, [Buenos Aires, Sociedad Argentina de Informática e Investigación Operativa (SADIO), 2000], pp. 52–62; a PDF version of this paper is available on this website as is a summary.
"Efficient Multi-variate Abstraction Using an Array Representation for Combinators",
Information Processing Letters,
vol. 84 (2002), pp. 311–317; the
full text of this paper
is freely available.
It is discussed or mentioned in the following:
- Julien Cohen, Jean-Louis Giavitto and Olivier Michel, "Variable elimination for building interpreters"; I don't think this has been published yet.
- Julien Cohen, "Interprétation par SK-traduction et syntaxe abstraite d’ordre supérieur", in O. Michel (ed.), Journées Francophones des Langages Applicatifs (JFLA 2005) (2005), INRIA, pp. 17–34.
© Antoni Diller (21 March 2014)