Evaluation

Jason Cairns

2022-10-31

The evaluation of an expression in a programming language operates under a set of rules by which the original expression is progressively rewritten. Of greatest interest when discussing evaluation strategies is the treatment of procedures, and in particular the parameter-passing strategy and binding strategy used in evaluation.

The mechanism of parameter passing can take several different forms, but is categorised at the highest level by whether or not the parameters passed to a function must be fully evaluated before the function is applied [1]. A strategy requiring full evaluation before function application is termed “strict” evaluation, while the converse is termed “non-strict” evaluation. Alternative terminology is “eager” and “lazy” evaluation, respectively[2]. Binding strategy is a dependent concept, in which the form of the parameters passed to the function is considered[3]. Binding strategies falling under strict evaluation involve passing to the function a fully evaluated value, an address of the value, or some other public interface. Non-strict binding strategies typically involve some variation on passing the name of a value that only resolves upon attempted access.

The challenge of evaluation in a distributed system lies in how arguments, errors, and the order of computation are to be handled.

In many cases, the evaluation strategy makes no difference to the outcome of an evaluation, especially among simple pure functions with constant inputs. Evaluation strategies only became a major research topic once the question arose of how to compile subroutine calls whose arguments are arbitrary expressions rather than pure constants. The choice of strategy becomes more relevant in computation with side-effects, where the order of operations may run counter to expectations. Furthermore, non-strict binding strategies make trivial the creation of ad-hoc control structures, as well as infinite data structures and the avoidance of certain error conditions[4].

R possesses an evaluation system that is unusual among programming languages. Of the parameter-passing strategies discussed, it is non-strict, in that arguments are not evaluated before the function they are passed to is applied[5]. The binding strategy falls into the category of “call-by-need”, where parameters are only evaluated upon being referenced during the evaluation of the body of the function[6]. Call-by-need is a variation of the more general “call-by-name” strategy. Call-by-name involves passing the name, in some form, to the function, where, in the terminology of lambda calculus, it is beta-reduced with capture-avoiding substitution[7]. Call-by-need is effectively an optimisation of call-by-name; while call-by-name requires evaluation of the parameter every time it is referenced, call-by-need engages in memoisation, caching the value of the first evaluation of each parameter for successive references.

Interestingly, call-by-need is most prevalent among functional programming languages with roots in the lambda calculus, including Haskell, though it was introduced more recently than other, more common evaluation strategies[8]. Call-by-need grants unique abilities; most notably, the capacity for non-evaluation of parameters may bring performance improvements. In the extreme case, a function evaluated in a call-by-need manner may terminate in circumstances where strict evaluation would prevent termination. An example is given in lst. 1.

Listing 1: Evaluation of this program will terminate under call-by-need but not terminate under strict evaluation

> x <- function(...) NULL
> x(Sys.sleep(Inf))

Furthermore, call-by-need makes the expression of a passed argument available, something commonly made use of in R, such as in the deparse(substitute()) idiom employed for the implicit labelling of plot axes, among other uses. Other abilities granted by way of call-by-name include the potential to create control structures, short-circuit evaluation as typified by boolean operations, and infinite data structures. Drawbacks to call-by-need include potential inefficiencies in the implementation of the strategy, as well as associated complexities and difficulties in debugging[9].
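
As a brief illustration, the following minimal sketch shows the substitute() mechanism underlying this idiom, capturing the unevaluated argument expression as a label:

> label_for <- function(x) deparse(substitute(x))
> label_for(height * 2)
[1] "height * 2"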

The manner of implementation for call-by-need typically requires a data structure known as a “thunk”[10]. A thunk is a variation on a closure, keeping a reference to the passed expression and an environment in which to evaluate the expression when the thunk is accessed in the body of the procedure. Upon evaluation, the thunk is effectively replaced by its computed value, with further references accessing that value directly. R makes use of this form of processing, with thunks referred to as “promises”; the term “promise” here bears no relation to its standard usage for the concurrent data structures associated with futures[11]. A promise in R encapsulates pointers to the value, expression, and environment. Promises are introduced in R through a list of promises replacing the argument list upon a call to a closure. A new environment is constructed for the evaluation of the closure, and the promises are matched to the names of the formal parameters. Promises for supplied arguments are evaluated in the evaluation frame of the calling function, rather than in the evaluation frame of the called function, as default arguments are. Upon evaluation of a promise, the value pointer is set to the resulting value, the environment is set to NULL, and the header of the SEXP underlying the promise has the “PRSEEN” flag set to indicate that it has been accessed, in order to avoid recursive loops. In R nomenclature, the evaluation of a promise is known as “forcing” the promise, and from this point on in the evaluation of the closure the promise serves only its value, with no further evaluation. As hinted at above, the expression remains accessible through the substitute() function. Evaluation of a promise only takes place when the value is needed, and promises may themselves refer to other promises, such as when unevaluated arguments are passed through calling functions; these promises are recursively forced in turn.
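
The forcing and memoisation behaviour can be observed directly in a minimal sketch, using message() as a side-effect to mark when evaluation occurs:

> f <- function(x) { message("body entered"); x; x }
> f({ message("promise forced"); 42 })
body entered
promise forced
[1] 42

The “promise forced” message appears only once despite x being accessed twice, as the second access reads the cached value.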

Promises are used beyond function calls in R, including in loading code from packages: loading a package attaches a namespace populated with promises which are only evaluated upon access.

An exception to the evaluation strategy described here for R is that the “builtin” functions in R are evaluated strictly, with parameters being evaluated before being passed to the underlying C code.

At this point it is worth discussing other mechanisms for evaluation, for context.

Strict evaluation is typified by applicative-order evaluation, which has rewrite rules corresponding to complete evaluation of all function arguments prior to evaluation of the function[3]. The principal binding strategies include call-by-value, call-by-reference, and object-oriented variations such as call-by-object. Call-by-value involves full evaluation of parameters, then calling the function with copies of the values of the parameters[12]. It is commonly used, with an example being C’s handling of non-pointer data. Call-by-reference requires passing parameter references to a procedure, allowing for mutability[13]. Examples include C’s handling of pointer data, as well as equivalent evaluations in Java and Rust. Notably, in functional languages with immutable data there is no difference in outcome between the two strategies. Call-by-object involves sharing an object passed as a parameter [14]. This evaluation strategy relates to call-by-reference, though it differs in that passing a variable means passing access to the actual object, not access to the caller’s variable. In this way, a passed parameter is able to be mutated from within a function, though reassignment to the formal parameter doesn’t change the binding in the caller. Many object-oriented languages engage in a call-by-object strategy, including Python and Ruby.

Call-by-name and call-by-need are joined by other forms and mechanisms of lazy evaluation. One such strategy, known as optimistic evaluation, is a variation on call-by-need wherein a function’s parameters are evaluated under some time limit upon the application of that function, with any evaluations exceeding the time limit terminated and switched to a standard call-by-need evaluation strategy [15]. This approach is intended to capture the efficiency benefits of both call-by-need and strict evaluation. Another optimisation in lazy evaluation is given by graph reduction. Graph reduction involves treating expressions not only as abstract syntax trees, but as directed acyclic graphs, where multiple invocations of the same expression are depicted only once in the graph and linked together to avoid re-evaluation of the same expression[16]. An equivalence exists between graph reduction and the memoisation of call-by-need.

Given that distributing computation necessitates a rearrangement of how evaluation is performed, a question arises: how does evaluation proceed in a distributed system for R?

The SNOW package may be considered first for its pedigree. One of the fundamental procedures in SNOW is clusterCall()[17]. The evaluation of this procedure consists in first evaluating arguments on the master, with their values sent to the slave nodes, which execute the function call. This corresponds to a strict evaluation strategy, being effectively call-by-value. There is some nuance, though: within the function being called on the slave node, a call-by-need strategy still takes place, as the call is applied by the R evaluator, but with all arguments having been evaluated prior to function application, the “strict” classification dominates in practice. Other functions in SNOW follow a similar form.
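
A minimal sketch of this behaviour, assuming the snow package and a two-worker socket cluster, shows the argument expression being fully evaluated on the master before its value is shipped to the workers:

> library(snow)
> cl <- makeCluster(2, type = "SOCK")
> clusterCall(cl, function(x) x + 1, 1 + 1)  # 1 + 1 is evaluated on the master
[[1]]
[1] 3

[[2]]
[1] 3

> stopCluster(cl)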

Prior prototypes in the largescaler project developed herein follow a similar form. Evaluation of distributed procedures requires evaluation of arguments on arbitrary nodes in the network, with the values sent to the node executing the function call. Unlike SNOW, evaluating arguments on arbitrary nodes enables parallelism in argument evaluation, which is potentially more efficient. This corresponds roughly to a call-by-future strategy. However, like SNOW, it effectively remains a strict evaluation strategy, due to requiring full parameter evaluation prior to function application, though it still possesses some degree of subtlety. This subtlety leads to a somewhat orthogonal side-point: the form of generation of the function call differs from that in regular R. Promises are still used in the function call, so the originating expression is still in theory attainable. However, prior largescaleR prototypes have generated function calls through the do.call() function, which constructs a call from a function and a list of arguments, then evaluates the call. This results in a mismatch between the original call and the actual call. An example of such a mismatch is given in lst. 2.

Listing 2: A mismatch in an actual call and one generated with do.call

> x = function(...) match.call()
> y = 1
> x(y)
x(y)
> do.call(x, list(y))
(function(...) match.call())(1)

This is seen in the issue surrounding call capture in model fitting, expounded upon in a separate document.

Alternative means of performing evaluation are conceivable. For instance, a manner of evaluation similar to standard R could be enabled through making use of distributed promises. The evaluation or collection of the arguments to a function on a node would then take place only upon those arguments being accessed in the evaluation of the function body. This implies a need to take advantage of promise objects. Such a capacity is already inherent in R, with delayedAssign() from the base package granting direct access to promise creation. Upon access of the promise objects, some operation would be enacted to materialise the underlying value locally, potentially including the initiation of remote procedures. There are limitations, however; one example is the execution of a remote distributed call with local arguments: if the call is to take place asynchronously, there is no immediate mechanism for the node executing the call to obtain the local arguments as and when (and if) they are needed. This would require evaluation of the arguments before sending off the call. It might be overcome if such local arguments were placed as promises in an asynchronously running local data store that evaluated the arguments prior to sending them off to requesters. All of this remains highly complex – one of the original points of opposition to thunks [10] among compiler constructors was the difficulty and complexity introduced through their usage, with the dearth of debuggers originally available as testament to this notion.
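
A minimal sketch of such a distributed promise, assuming a hypothetical fetch_chunk() function that retrieves a chunk’s value from whichever node holds it (no such function exists in base R or any package referenced here):

# fetch_chunk() is hypothetical; it stands in for a remote retrieval operation
delayedAssign("arg1", fetch_chunk("chunk-42"))
# "arg1" is now bound to a promise; fetch_chunk() runs only if and when arg1
# is accessed, at which point the value is cached for all further references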

Another point to consider is that lazy evaluation can be emulated in strict languages supporting closures or generators, by caching the value when the construct is first called. However, this is less relevant to the case under consideration here, as it would require explicit calls to arguments in the definitions of functions to be evaluated in a distributed manner, which is undesirable when aiming for an API with greater conformance to standard R usage.
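
A minimal sketch of such emulation, using a closure to cache the value of a deferred expression on first call:

make_thunk <- function(deferred) {
  forced <- FALSE
  value <- NULL
  function() {
    if (!forced) {
      value <<- deferred()   # evaluate the deferred expression once
      forced <<- TRUE
    }
    value                    # all later calls return the cached value
  }
}
x <- make_thunk(function() { message("evaluating"); 42 })
x()  # prints "evaluating" and returns 42
x()  # returns the cached 42 without re-evaluation

Note that the thunk must be explicitly called as x() rather than simply referenced as x, which is precisely the divergence from standard R usage that makes this approach undesirable here.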

The side-point made about call construction, illustrated in lst. 2, may be met by capturing the original call and sending it, along with references to the values, to the executing node. Upon reception, the executing node may construct an outer evaluation frame in which the variable names given as arguments in the captured expression are bound to their associated values (or distributed thunks)[10], with the expression evaluated within this outer evaluation frame. A drawback of the approach is evaluation of parameters prior to function application, but this may be sidestepped through the C API in a similar manner to the definition of ...length()[3].
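
A minimal sketch of this approach, with locally resolved values standing in for the remote references:

captured <- quote(x(y))                  # the original call, as captured on the client
frame <- new.env(parent = globalenv())   # outer evaluation frame on the executing node
frame$x <- function(...) match.call()    # resolved function value
frame$y <- 1                             # resolved argument value (or a distributed thunk)
eval(captured, frame)
# x(y) -- the call retains its original form, unlike the do.call() construction of lst. 2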

Another aspect to consider with distributed arguments is that of alignment. A node may only engage in computation with data that resides on it at the point of computation. For the node to obtain the data, the data must exist somewhere in the cluster and may need to be transferred to the computing node. The arrangement of the underlying chunks corresponding to the distributed arguments of a function must therefore be ordered in a very specific manner, so that the relevant chunks which interact with each other eventually find themselves placed on the same node. For instance, consider a computation involving two lists of multiple elements, where the computation only requires interaction between elements at the same index at any one time. In this case, each pair of elements matched by index is required to be on the same node, but there are no further requirements regarding location. Such a computation is trivial to parallelise and distribute, with only that one requirement. The notion of ‘recycling’ is also pertinent to alignment. Many functions in R will repeat the elements of shorter vector arguments to match the length of the longest. Emulating such functionality in a distributed system is decidedly non-trivial, however. The naive approach of repeating chunks breaks down entirely when chunks don’t hold a perfectly linear correspondence to the number of elements in the represented data structure. As such, the data structure has to be queried for its lengths, with chunks then selected from and rearranged in order to perform the recycling. Selection of elements suffers from precisely the same issue; knowing where the nth element lies in a distributed object requires querying chunks from the beginning onwards until it is found, as sketched below.
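
A minimal sketch of the element-location problem, assuming the chunk lengths have already been queried from the nodes holding them:

locate_element <- function(n, chunk_lengths) {
  ends <- cumsum(chunk_lengths)          # running totals of elements per chunk
  chunk <- which(n <= ends)[1]           # first chunk whose range covers n
  offset <- n - c(0, ends)[chunk]        # position of n within that chunk
  list(chunk = chunk, offset = offset)
}
locate_element(7, c(3, 3, 4))  # element 7 lies at offset 1 within chunk 3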

Errors also pose a significant challenge. Upon encountering one in a standard session, evaluation is halted and directed by the specific handling of the error. In a distributed system, errors take different forms, and can be impossible to detect until long after they take place. Errors may be captured, but this is insufficient if the information is not propagated back to the client node from which the faulty request originated. Were every single computation to be monitored for errors and the result propagated to the client, this would represent a piece of infrastructure with the potential to cause major inefficiencies if it involves any significant communication or blocking. As such, an appropriate mechanism for handling errors is a non-trivial challenge to be solved.
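
A minimal sketch of the capture side of such a mechanism, in which a worker wraps each requested evaluation so that an error condition is returned as data rather than halting the worker; the propagation of that condition back to the originating client is the part left unaddressed:

run_request <- function(expr, env = globalenv()) {
  tryCatch(
    list(status = "ok", value = eval(expr, env)),
    error = function(e) list(status = "error", message = conditionMessage(e))
  )
}
run_request(quote(stop("faulty chunk")))
# returns list(status = "error", message = "faulty chunk") instead of raising the error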

[1]
E. Crank and M. Felleisen, “Parameter-passing and the lambda calculus,” in Proceedings of the 18th ACM SIGPLAN-SIGACT symposium on principles of programming languages, 1991, pp. 233–244. doi: 10.1145/99583.99616.
[2]
P. Henderson and J. H. Morris, “A lazy evaluator,” in Proceedings of the 3rd ACM SIGACT-SIGPLAN symposium on principles on programming languages, 1976, pp. 95–103. doi: 10.1145/800168.811543.
[3]
H. Abelson and G. J. Sussman, “4.2.1 normal order and applicative order,” in Structure and interpretation of computer programs, Cambridge, Mass. New York: MIT Press McGraw-Hill, 1996.
[4]
C. Okasaki, Purely functional data structures. Cambridge, U.K. New York: Cambridge University Press, 1998.
[5]
R Core Team, “Argument evaluation,” in R internals, r-project.org, 2020.
[6]
Z. M. Ariola, J. Maraist, M. Odersky, M. Felleisen, and P. Wadler, “A call-by-need lambda calculus,” in Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on principles of programming languages, 1995, pp. 233–246. doi: 10.1145/199448.199507.
[7]
M. J. Fischer, “Lambda calculus schemata,” ACM SIGPLAN Notices, vol. 7, no. 1, pp. 104–109, 1972.
[8]
S. L. Nita and M. Mihailescu, “Introduction,” in Practical concurrent haskell: With big data applications, Berkeley, CA: Apress, 2017, pp. 3–11. doi: 10.1007/978-1-4842-2781-7_1.
[9]
H. Nilsson, “Tracing piece by piece: Affordable debugging for lazy functional languages,” in Proceedings of the fourth ACM SIGPLAN international conference on functional programming, 1999, pp. 36–47. doi: 10.1145/317636.317782.
[10]
P. Z. Ingerman, “Thunks: A way of compiling procedure statements with some comments on procedure declarations,” Commun. ACM, vol. 4, no. 1, pp. 55–58, Jan. 1961, doi: 10.1145/366062.366084.
[11]
R Core Team, “Promise objects,” in R language definition, r-project.org, 2020.
[12]
G. D. Plotkin, “Call-by-name, call-by-value and the λ-calculus,” Theoretical computer science, vol. 1, no. 2, pp. 125–159, 1975.
[13]
F. Turbak, Design concepts in programming languages. Cambridge, Mass: MIT Press, 2008.
[14]
B. Liskov, R. Atkinson, T. Bloom, E. Moss, and C. Schaffert, “CLU reference manual.” Massachusetts Institute of Technology, Laboratory for Computer Science, 1979.
[15]
R. Ennals and S. Peyton Jones, “Optimistic evaluation: A fast evaluation strategy for non-strict programs,” in ACM International Conference on Functional Programming (ICFP’03), 2003. Available: https://www.microsoft.com/en-us/research/publication/optimistic-evaluation-fast-evaluation-strategy-non-strict-programs/
[16]
P. Hudak, “Conception, evolution, and application of functional programming languages,” ACM Comput. Surv., vol. 21, no. 3, pp. 359–411, Sep. 1989, doi: 10.1145/72551.72554.
[17]
L. Tierney, A. J. Rossini, N. Li, and H. Sevcikova, SNOW: Simple network of workstations. CRAN, 2018.