Could someone explain? I understand the basic concepts behind them but I often see them used interchangeably and I get confused.
And now that we're here, how do they differ from a regular function?
A lambda is just an anonymous function - a function defined with no name. In some languages, such as Scheme, they are equivalent to named functions. In fact, the function definition is re-written as binding a lambda to a variable internally. In other languages, like Python, there are some (rather needless) distinctions between them, but they behave the same way otherwise.
A closure is any function which closes over the environment in which it was defined. This means that it can access variables not in its parameter list. Examples:
def func(): return h
def anotherfunc(h):
    return func()
This will cause an error, because func does not close over the environment in anotherfunc - h is undefined. func only closes over the global environment. This will work:
def anotherfunc(h):
    def func(): return h
    return func()
Because here, func is defined in anotherfunc, and since Python 2.2 or so (when Python almost got closures right - mutation still didn't work), this means that it closes over anotherfunc's environment and can access variables inside it. In Python 3, mutation works too, using the nonlocal keyword.
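For instance, here is a minimal sketch of that mutation in Python 3 (the increment is added purely for illustration and is not part of the original example):

def anotherfunc(h):
    def func():
        nonlocal h     # rebind h in anotherfunc's scope (Python 3 only)
        h = h + 1
        return h
    return func()

print(anotherfunc(10))  # prints 11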
Another important point - func will continue to close over anotherfunc's environment even when it's no longer being evaluated in anotherfunc. This code will also work:
def anotherfunc(h):
    def func(): return h
    return func

print(anotherfunc(10)())
This will print 10.
This, as you notice, has nothing to do with lambdas - they are two different (although related) concepts.
There is a lot of confusion around lambdas and closures, even in the answers to this StackOverflow question here. Instead of asking random programmers who learned about closures from practice with certain programming languages, or other clueless programmers, take a journey to the source (where it all began). And since lambdas and closures come from the Lambda Calculus invented by Alonzo Church back in the '30s, before the first electronic computers even existed, this is the source I'm talking about.
Lambda Calculus is the simplest programming language in the world. The only things you can do in it:
APPLICATION: Applying one expression to another, denoted f x. (Think of it as a function call, where f is the function and x is its only parameter)
ABSTRACTION: Binds a symbol occurring in an expression to mark that this symbol is just a "slot", a blank box waiting to be filled with value, a "variable" as it were. It is done by prepending a Greek letter λ (lambda), then the symbolic name (e.g. x), then a dot . before the expression. This then converts the expression into a function expecting one parameter. For example: λx.x+2 takes the expression x+2 and tells that the symbol x in this expression is a bound variable – it can be substituted with a value you supply as a parameter. Note that the function defined this way is anonymous – it doesn't have a name, so you can't refer to it yet, but you can immediately call it (remember application?) by supplying it the parameter it is waiting for, like this: (λx.x+2) 7. Then the expression (in this case a literal value) 7 is substituted as x in the subexpression x+2 of the applied lambda, so you get 7+2, which then reduces to 9 by common arithmetics rules.
So we've solved one of the mysteries: lambda is the anonymous function from the example above, λx.x+2. In a language like JavaScript, it looks like this:
function(x) { return x+2; }
and you can immediately apply it to some parameter like this:
(function(x) { return x+2; })(7)
or you can store this anonymous function (lambda) into some variable:
var f = function(x) { return x+2; }
which effectively gives it a name, f, allowing you to refer to it and call it multiple times later, e.g.:
alert( f(7) + f(10) ); // should print 21 in the message box
But you didn't have to name it. You could call it immediately:
alert( function(x) { return x+2; } (7) ); // should print 9 in the message box
In LISP, lambdas are made like this:
(lambda (x) (+ x 2))
and you can call such a lambda by applying it immediately to a parameter:
( (lambda (x) (+ x 2)) 7 )
Now for the other mystery: closure. To get there, we first need to talk about the symbols (variables) appearing in lambda expressions.
As I said, what the lambda abstraction does is bind a symbol in its subexpression, so that it becomes a substitutable parameter. Such a symbol is called bound. But what if there are other symbols in the expression? For example: λx.x/y+2. In this expression, the symbol x is bound by the lambda abstraction λx. preceding it. But the other symbol, y, is not bound – it is free. We don't know what it is and where it comes from, so we don't know what it means and what value it represents, and therefore we cannot evaluate that expression until we figure out what y means.
In fact, the same goes for the other two symbols, 2 and +. It's just that we are so familiar with these two that we usually forget the computer doesn't know them, and we need to tell it what they mean by defining them somewhere, e.g. in a library or in the language itself.
You can think of the free symbols as defined somewhere else, outside the expression, in its "surrounding context", which is called its environment. The environment might be a bigger expression that this expression is a part of (as Qui-Gon Jinn said: "There's always a bigger fish" ;) ), or in some library, or in the language itself (as a primitive).
This lets us divide lambda expressions into two categories:
CLOSED expressions: every symbol that occurs in these expressions is bound by some lambda abstraction. In other words, they are self-contained; they don't require any surrounding context to be evaluated. They are also called combinators.
OPEN expressions: some symbols in these expressions are not bound – that is, some of the symbols occurring in them are free and they require some external information, and thus they cannot be evaluated until you supply the definitions of these symbols.
You can CLOSE an open lambda expression by supplying the environment, which defines all these free symbols by binding them to some values (which may be numbers, strings, anonymous functions aka lambdas, whatever…).
And here comes the closure part: The closure of a lambda expression is this particular set of symbols defined in the outer context (environment) that give values to the free symbols in this expression, making them non-free anymore. It turns an open lambda expression, which still contains some "undefined" free symbols, into a closed one, which doesn't have any free symbols anymore.
For example, if you have the following lambda expression: λx.x/y+2, the symbol x is bound, while the symbol y is free, therefore the expression is open and cannot be evaluated unless you say what y means (and the same with + and 2, which are also free). But suppose that you also have an environment like this:
{ y: 3,
+: [built-in addition],
2: [built-in number],
q: 42,
w: 5 }
This environment supplies definitions for all the "undefined" (free) symbols from our lambda expression (y, +, 2), and several extra symbols (q, w). The symbols that we need to be defined are this subset of the environment:
{ y: 3,
+: [built-in addition],
2: [built-in number] }
and this is precisely the closure of our lambda expression :>
In other words, it closes an open lambda expression. This is where the name closure came from in the first place, and this is why so many people's answers in this thread are not quite correct :P
Well, the corporate marketoids of Sun/Oracle, Microsoft, Google etc. are to blame, because that's what they called these constructs in their languages (Java, C#, Go etc.). They often call "closures" what are supposed to be just lambdas. Or they call "closures" a particular technique they used to implement lexical scoping, that is, the fact that a function can access the variables that were defined in its outer scope at the time of its definition. They often say that the function "encloses" these variables, that is, captures them into some data structure to save them from being destroyed after the outer function finishes executing. But this is just made-up post factum "folklore etymology" and marketing, which only makes things more confusing, because every language vendor uses its own terminology.
And it's even worse because of the fact that there's always a bit of truth in what they say, which does not allow you to easily dismiss it as false :P Let me explain:
If you want to implement a language that uses lambdas as first-class citizens, you need to allow them to use symbols defined in their surrounding context (that is, to use free variables in your lambdas). And these symbols must be there even when the surrounding function returns. The problem is that these symbols are bound to some local storage of the function (usually on the call stack), which won't be there anymore when the function returns. Therefore, in order for a lambda to work the way you expect, you need to somehow "capture" all these free variables from its outer context and save them for later, even when the outer context will be gone. That is, you need to find the closure of your lambda (all these external variables it uses) and store it somewhere else (either by making a copy, or by preparing space for them upfront, somewhere else than on the stack). The actual method you use to achieve this goal is an "implementation detail" of your language. What's important here is the closure, which is the set of free variables from the environment of your lambda that need to be saved somewhere.
It didn't take long for people to start calling the actual data structure that their language implementations use to implement the closure the "closure" itself. The structure usually looks something like this:
Closure {
[pointer to the lambda function's machine code],
[pointer to the lambda function's environment]
}
and these data structures are passed around as parameters to other functions, returned from functions, and stored in variables to represent lambdas, allowing them to access their enclosing environment as well as the machine code to run in that context. But it's just one way (of many) to implement a closure, not the closure itself.
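As a concrete cross-check (not part of the original answer), CPython happens to expose exactly this code/environment pairing on its function objects; a minimal sketch, assuming Python 3:

def outer():
    y = 3
    def f(x):           # y is free inside f ...
        return x / y + 2
    return f            # ... so returning f forces y to be captured

f = outer()
print(f.__code__.co_freevars)           # ('y',) -- the free variables of the "code part"
print(f.__closure__[0].cell_contents)   # 3      -- the captured environment entry
print(f(7))                             # 4.333...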
As I explained above, the closure of a lambda expression is the subset of definitions in its environment that give values to the free variables contained in that lambda expression, effectively closing the expression (turning an open lambda expression, which cannot be evaluated yet, into a closed lambda expression, which can then be evaluated, since all the symbols contained in it are now defined).
Anything else is just a "cargo cult" and "voo-doo magic" of programmers and language vendors unaware of the real roots of these notions.
I hope that answers your questions. But if you had any follow-up questions, feel free to ask them in the comments, and I'll try to explain it better.
When most people think of functions, they think of named functions:
function foo() { return "This string is returned from the 'foo' function"; }
These are called by name, of course:
foo(); //returns the string above
With lambda expressions, you can have anonymous functions:
@foo = lambda() {return "This is returned from a function without a name";}
With the above example, you can call the lambda through the variable it was assigned to:
foo();
More useful than assigning anonymous functions to variables, however, is passing them to or from higher-order functions, i.e., functions that accept/return other functions. In a lot of these cases, naming a function is unnecessary:
function filter(list, predicate)
{ @filteredList = [];
for-each (@x in list) if (predicate(x)) filteredList.add(x);
return filteredList;
}
//filter for even numbers
filter([0,1,2,3,4,5,6], lambda(x) {return (x mod 2 == 0)});
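For comparison, here is the same filtering idea in real Python (a loose translation of the pseudocode above, using the built-in filter):

# keep the even numbers by passing an anonymous function to a higher-order function
evens = list(filter(lambda x: x % 2 == 0, [0, 1, 2, 3, 4, 5, 6]))
print(evens)  # [0, 2, 4, 6]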
A closure may be a named or anonymous function, but is known as such when it "closes over" variables in the scope where the function is defined, i.e., the closure will still refer to the environment with any outer variables that are used in the closure itself. Here's a named closure:
@x = 0;
function incrementX() { x = x + 1;}
incrementX(); // x now equals 1
That doesn't seem like much, but what if this was all in another function and you passed incrementX to an external function?
function foo()
{ @x = 0;
function incrementX()
{ x = x + 1;
return x;
}
return incrementX;
}
@y = foo(); // y = closure of incrementX over foo.x
y(); //returns 1 (y.x == 0 + 1)
y(); //returns 2 (y.x == 1 + 1)
This is how you get stateful objects in functional programming. Since naming "incrementX" isn't needed, you can use a lambda in this case:
function foo()
{ @x = 0;
return lambda()
{ x = x + 1;
return x;
};
}
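A rough Python translation of the above (Python lambdas cannot contain assignments, so a nested def with nonlocal stands in for the pseudocode's lambda; the names are only illustrative):

def foo():
    x = 0
    def increment():
        nonlocal x   # mutate the captured x, not a new local
        x = x + 1
        return x
    return increment

y = foo()
print(y())  # 1
print(y())  # 2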
Not all closures are lambdas and not all lambdas are closures. Both are functions, but not necessarily in the manner we're used to knowing.
A lambda is essentially a function that is defined inline rather than the standard method of declaring functions. Lambdas can frequently be passed around as objects.
A closure is a function that encloses its surrounding state by referencing fields external to its body. The enclosed state remains across invocations of the closure.
In an object-oriented language, closures are normally provided through objects. However, some OO languages (e.g. C#) implement special functionality that is closer to the definition of closures provided by functional languages (such as Lisp) that do not have objects to enclose state.
What's interesting is that the introduction of Lambdas and Closures in C# brings functional programming closer to mainstream usage.
It's as simple as this: lambda is a language construct, i.e. simply syntax for anonymous functions; a closure is a technique to implement it -- or any first-class functions, for that matter, named or anonymous.
More precisely, a closure is how a first-class function is represented at runtime, as a pair of its "code" and an environment "closing" over all the non-local variables used in that code. This way, those variables are still accessible even when the outer scopes where they originate have already been exited.
Unfortunately, there are many languages out there that do not support functions as first-class values, or only support them in crippled form. So people often use the term "closure" to distinguish "the real thing".
From the point of view of programming languages, they are two completely different things.
Basically, for a Turing-complete language we only need very limited elements, e.g. abstraction, application and reduction. Abstraction and application provide the way to build up lambda expressions, and reduction determines the meaning of the lambda expression.
Lambda provides a way to abstract the computation process out. For example, to compute the sum of two numbers, a process which takes two parameters x, y and returns x+y can be abstracted out. In Scheme, you can write it as
(lambda (x y) (+ x y))
You can rename the parameters, but the task it completes doesn't change. In almost all programming languages, you can give the lambda expression a name; these are named functions. But there is not much difference - conceptually they can be considered just syntactic sugar.
OK, now imagine how this can be implemented. Whenever we apply the lambda expression to some expressions, e.g.
((lambda (x y) (+ x y)) 2 3)
We can simply substitute the parameters with the expressions to be evaluated. This model is already very powerful. But it doesn't enable us to change the values of symbols, e.g. we can't mimic a change of state. Thus we need a more complex model. To make it short, whenever we want to calculate the meaning of a lambda expression, we put the pairs of symbols and their corresponding values into an environment (or table). Then the rest, (+ x y), is evaluated by looking up the corresponding symbols in the table. Now, if we provide some primitives to operate on the environment directly, we can model changes of state!
With this background, check this function:
(lambda (x y) (+ x y z))
We know that when we evaluate the lambda expression, x and y will be bound in a new table. But how and where can we look z up? Actually, z is called a free variable. There must be an outer environment which contains z. Otherwise the meaning of the expression can't be determined by binding only x and y. To make this clear, you can write something as follows in Scheme:
((lambda (z) (lambda (x y) (+ x y z))) 1)
So z would be bound to 1 in an outer table. We still get a function which accepts two parameters, but its real meaning also depends on the outer environment. In other words, the outer environment closes over the free variables. With the help of set!, we can make the function stateful, i.e. it's not a function in the mathematical sense. What it returns depends not only on the input, but on z as well.
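A rough Python analogue of the Scheme expression above, purely for illustration:

add_with_z = (lambda z: (lambda x, y: x + y + z))(1)
print(add_with_z(2, 3))  # 6 -- z was bound to 1 in the outer environment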
This is something you already know very well: a method of an object almost always relies on the state of the object. That's why some people say "closures are poor man's objects." But we could also consider objects as poor man's closures, since we really like first-class functions.
I use Scheme to illustrate the ideas because Scheme is one of the earliest languages with real closures. All of the material here is much better presented in SICP, chapter 3.
To sum up, lambda and closure are really different concepts. A lambda is a function. A closure is a pair of lambda and the corresponding environment which closes the lambda.
The concept is the same as described above, but if you come from a PHP background, this explains it further using PHP code.
$input = array(1, 2, 3, 4, 5);
$output = array_filter($input, function ($v) { return $v > 2; });
function ($v) { return $v > 2; } is the lambda function definition. We can even store it in a variable, so it can be reusable:
$max = function ($v) { return $v > 2; };
$input = array(1, 2, 3, 4, 5);
$output = array_filter($input, $max);
Now, what if you want to change the maximum number allowed in the filtered array? You would have to write another lambda function or create a closure (PHP 5.3):
$max_comp = function ($max) {
return function ($v) use ($max) { return $v > $max; };
};
$input = array(1, 2, 3, 4, 5);
$output = array_filter($input, $max_comp(2));
A closure is a function that is evaluated in its own environment, which has one or more bound variables that can be accessed when the function is called. They come from the functional programming world, where there are a number of concepts in play. Closures are like lambda functions, but smarter in the sense that they have the ability to interact with variables from the outside environment of where the closure is defined.
Here is a simpler example of PHP closure:
$string = "Hello World!";
$closure = function() use ($string) { echo $string; };
$closure();
Nicely explained in this article.
This question is old and has many answers. Now that Java 8 has official lambdas (which are, unofficially, closures), it revives the question.
The answer in Java context (via Lambdas and closures — what’s the difference?):
"A closure is a lambda expression paired with an environment that binds each of its free variables to a value. In Java, lambda expressions will be implemented by means of closures, so the two terms have come to be used interchangeably in the community."
Simply speaking, a closure is a trick about scope, and a lambda is an anonymous function. We can realize a closure with a lambda more elegantly, and a lambda is often used as a parameter passed to a higher-order function.
A lambda expression is just an anonymous function. In plain Java, for example, you can write it like this:
Function<Person, Job> mapPersonToJob = new Function<Person, Job>() {
public Job apply(Person person) {
Job job = new Job(person.getPersonId(), person.getJobDescription());
return job;
}
};
where the class Function is just built-in Java code. Now you can call mapPersonToJob.apply(person) somewhere to use it. That's just one example: that's what a lambda looked like before there was syntax for it. Lambda expressions are a shortcut for this.
Closure:
A lambda becomes a closure when it can access the variables outside of its own scope. I guess you can say it's magic: it can magically wrap around the environment it was created in and use the variables outside of its scope (the outer scope). So, to be clear, a closure means a lambda can access its OUTER SCOPE.
In Kotlin, a lambda can always access its closure (the variables that are in its outer scope).
Lambda vs Closure
Lambda
is an anonymous function (method)
Closure
is a function which closes over (captures) variables from its enclosing scope (e.g. non-local variables)
Java
interface Runnable {
    void run();
}

class MyClass {
    void foo(Runnable r) {
    }

    //Lambda
    void lambdaExample() {
        foo(() -> {});
    }

    //Closure
    String s = "hello";
    void closureExample() {
        foo(() -> { s = "world"; });
    }
}
Swift[Closure]
class MyClass {
    func foo(r: () -> Void) {}

    func lambdaExample() {
        foo(r: {})
    }

    var s = "hello"
    func closureExample() {
        foo(r: { s = "world" })
    }
}
There is a lot of technically vague or "not even wrong" noise in the various existing answers to this question, so I'd finally add a new one...
Clarification on the terminology
It is better to know that the terms "closure" and "lambda" can both denote different things, depending on the context.
This is a formal issue because the specification of the PL (programming language) being discussed may define such terms explicitly.
For example, by ISO C++ (since C++11):
The type of a lambda-expression (which is also the type of the closure object) is a unique, unnamed non-union class type, called the closure type, whose properties are described below.
Just as users of C-like languages routinely confuse "pointers" (types) with "pointer values" or "pointer objects" (inhabitants of those types), there are risks of getting confused here, too: most C++ users are actually talking about "closure objects" when they use the term "closure". Be cautious about the ambiguity.
NOTE To make things generally clearer and more precise, I'll sometimes deliberately use language-neutral terms (usually specific to PL theory) instead of the language-defined terminology. For instance, type inhabitant used above covers the language-specific "(r)values" and "lvalues" in a broader sense. (Since the syntactic essence of C++'s value category definition is irrelevant here, avoiding "(l/r)values" may reduce confusion.) (Disclaimer: lvalues and rvalues are common enough in many other contexts.) Terms not defined formally among different PLs may be in quotes. Verbatim copies from referenced materials may also be in quotes, with typos unchanged.
This is even more relevant to "lambda". The (lower-case) letter lambda (λ) is an element of the Greek alphabet. When comparing "lambda" with "closure", one is certainly not talking about the letter itself, but about something behind the syntax that uses "lambda"-derived concepts.
The relevant constructs in modern PLs are usually named "lambda expressions". They are derived from "lambda abstractions", discussed below.
Before the detailed discussions, I recommend reading some of the comments on the question itself. I feel they are safer and more helpful than most answers to the question here, in the sense that there is less risk of getting confused. (Sadly, this is the most significant reason I decided to provide an answer here...)
Lambdas: a brief history
The constructs named "lambda" in PLs, whether called "lambda expressions" or something else, are syntactic. In other words, users of these languages can find such source-language constructs, which are used to build something else. Roughly, that "something else" is just an anonymous function in practice.
Such constructs originate from lambda abstractions, one of the three syntax categories ("kinds of expressions") of the (untyped) lambda calculus developed by A. Church.
Lambda calculus is a deductive system (more precisely, a TRS (term rewriting system)) to model computation universally. Reducing a lambda term is just like evaluating an expression in ordinary PLs. With the built-in reduction rules, it is sufficient to define the various ways to compute. (As you may know, it is Turing-complete.) Hence, it can be used as a PL.
NOTE Evaluating an expression in a PL is not interchangeable with reducing a term in a TRS in general. However, lambda calculus is a language with all reduction results expressible within the source language (i.e. as lambda terms), so the two coincide. Almost all PLs in practice do not have this property; the calculus describing their semantics may contain terms that are not source-language expressions, and reductions may have more detailed effects than evaluations.
Every term ("expression") in the lambda calculus (a lambda term) is either a variable, an abstraction or an application. "Variable" here is the syntax (just the variable's name) of a symbol, which can refer to an existing "variable" (semantically, an entity which may reduce to some other lambda term) introduced previously. The ability to introduce a variable is provided by the abstraction syntax, which has a leading letter λ, followed by a bound variable, a dot and a lambda term. The bound variable is similar to a formal parameter name both in syntax and semantics in many languages, and the lambda term that follows inside the lambda abstraction is just like the function body. The application syntax applies some abstraction to a lambda term (the "actual argument"), like a function call expression in many PLs.
NOTE A lambda abstraction can introduce only one parameter. To overcome the limitation inside the calculus, see Currying.
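For instance, a quick currying sketch in Python (an aside for illustration, not part of the calculus itself):

# a "two-parameter" function expressed as nested one-parameter lambdas
add = lambda x: lambda y: x + y
print(add(2)(3))  # 5 -- each lambda abstraction takes exactly one argument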
The ability to introduce variables makes lambda calculus a typical high-level language (albeit a simple one). On the other hand, combinatory logics can be treated as PLs by removing the variable and abstraction features from the lambda calculus. Combinatory logics are low-level exactly in this sense: they are like plain old assembly languages, which do not allow user-named variables to be introduced (aside from macros, which require additional preprocessing). (...If not more low-level: typically assembly languages can at least introduce user-named labels.)
Notice that a lambda abstraction can be built in place inside any other lambda term, without the need to specify a name to denote the abstraction. So the lambda abstraction as a whole forms an anonymous function (possibly nested). This is a quite high-level feature (compared to, e.g., ISO C, which does not allow anonymous or nested functions).
The successors of the untyped lambda calculus include various typed lambda calculi (like the lambda cube). These are more like statically typed languages, which require type annotations on the formal parameters of functions. Nevertheless, the lambda abstractions still play the same role here.
Although lambda calculi were not intended to be directly used as PLs implemented on computers, they have affected PLs in practice. Notably, J. McCarthy introduced the LAMBDA operator in LISP to provide functions exactly following the idea of Church's untyped lambda calculus. Apparently, the name LAMBDA comes from the letter λ. LISP (later) has a different syntax (S-expressions), but all programmable elements in LAMBDA expressions can be directly mapped to the lambda abstractions in the untyped lambda calculus by trivial syntactic conversions.
On the other hand, many other PLs express similar functionality by other means. A slightly different way to introduce reusable computations is named functions (or more exactly, named subroutines), which are supported by earlier PLs like FORTRAN and languages derived from ALGOL. They are introduced by syntax that specifies a named entity as being a function at the same time. This is simpler in some sense compared to LISP dialects (esp. in the aspect of implementation), and it seems to have been more popular than LISP dialects for decades. Named functions may also allow extensions not shared by anonymous functions, like function overloading.
Nevertheless, more and more industrial programmers finally discovered the usefulness of first-class functions, and demand for the ability to introduce function definitions in place (in expressions in arbitrary contexts, say, as an argument to some other function) kept increasing. It is natural and legitimate to avoid naming a thing that does not need a name, and named functions fail here by definition. (As you may know, naming things correctly is one of the well-known hard problems in computer science.) To address the problem, anonymous functions were introduced into languages traditionally providing only named functions (or function-like constructs like "methods", whatsoever), like C++ and Java. Many of them name the feature "lambda expressions" or something similarly lambda-ish, because they basically reflect essentially the same idea as in lambda calculi. Renaissance.
A bit of disambiguation: in the lambda calculus, all terms (variables, abstractions and applications) are effectively expressions in a PL; they are all "lambda expressions" in this sense. However, PLs adding lambda abstractions to enrich their features may specifically name the syntax of the abstraction the "lambda expression", to distinguish it from the other, existing kinds of expressions.
Closures: the history
Closure in mathematics is not the same as closure in PLs.
In the latter context, the term was coined by P. J. Landin in 1964, to provide support for first-class functions in the implementation of evaluation for PLs "modelled in Church's λ-notation".
Specific to the model proposed by Landin (the SECD machine), a closure is comprising the λ-expression and the environment relative to which it was evaluated, or more precisely:
an environment part which is a list whose two items are (1) an environment (2) an identifier or a list of identifiers
and a control part which consists of a list whose sole item is an AE
NOTE AE is an abbreviation for applicative expression in the paper. This syntax exposes more or less the same functionality as application in the lambda calculus. There are also some additional pieces of detail, like "applicative", which are not that interesting in the lambda calculus (because it is purely functional), though. SECD is not consistent with the original lambda calculus because of these minor differences. For example, SECD halts on any lone lambda abstraction whether or not its subterm (the "body") has a normal form, because it will not reduce the subterm ("evaluate the body") unless the abstraction has been applied ("called"). However, such behavior may be more like today's PLs than the lambda calculus. SECD is also not the only abstract machine that can evaluate lambda terms, although most other abstract machines for similar purposes also have environments. In contrast to the (pure) lambda calculus, these abstract machines can support mutation to some degree.
So, in this specific context, a closure is an internal data structure to implement specific evaluations of PLs with AEs.
The discipline of accessing the variables in closures reflects lexical scoping, first used in the early 1960s by the imperative language ALGOL 60. ALGOL 60 does support nested procedures and passing procedures as parameters, but not returning procedures as results. For languages with full support for first-class functions, which can be returned from functions, the static chain of ALGOL 60-style implementations does not work, because free variables used by the returned function may no longer be present on the call stack. This is the upwards funarg problem. Closures resolve the problem by capturing the free variables in the environment part and avoiding allocating them on the stack.
On the other hand, early LISP implementations all used dynamic scope. This makes all referenced variable bindings reachable in the global store, and name hiding (if any) is implemented on a per-variable basis: once a variable is created with an existing name, the old one is saved on a LIFO structure; in other words, each variable name accesses a corresponding global stack. This effectively removes the need for per-function environments, because no free variables are ever captured by a function (they are already "captured" by the stacks).
Despite imitating the lambda notation at first, LISP is very different from the lambda calculus here. The lambda calculus is statically scoped. That is, each variable denotes the instance bound by the nearest same-named formal parameter of a lambda abstraction containing the variable, before its reduction. In the semantics of the lambda calculus, reducing an application substitutes the term (the "argument") for the bound variable (the "formal parameter") in the abstraction. Since all values can be represented as lambda terms in the lambda calculus, this can be done by direct rewriting, replacing specific subterms in each step of the reduction.
NOTE So, environments are not essential to reduce the lambda terms. However, a calculus extending the lambda calculus can introduce the environments explicitly in the grammar, even when it only models pure computations (without mutation). By adding environments explicitly, there can be dedicated rules of constraints on the environments to enforce environment normalizations which strengthens the equational theory of the calculus. (See [Shu10] §9.1.)
LISP is quite different, because its underlying semantic rules are based neither on the lambda calculus nor on term rewriting. Therefore, LISP needs a different mechanism to maintain the scoping discipline. It adopted a mechanism based on environment data structures saving variable-to-value mappings (i.e. variable bindings). There may be more sophisticated structures in an environment in newer variants of LISP (e.g. lexically scoped Lisp allows mutation), but the simplest structure is conceptually equivalent to the environment defined in Landin's paper, discussed below.
LISP implementations did support first-class functions in the very early era, but with pure dynamic scoping there is no real funarg problem: they can simply avoid allocations on the stack and let a global owner (the GC, garbage collector) manage the resources in the environments (and activation records) referencing the variables. Closures are not needed then. And this is what the early implementations prior to the invention of closures did.
Deep binding, which approximates static (lexical) binding, was introduced around 1962 in LISP 1.5, via the FUNARG device. This finally made the problem well known under the name "funarg problem".
NOTE AIM-199 points out that this is essentially about the environments.
Scheme is the first Lisp dialect supporting lexical scoping by default (dynamic scope can be simulated with the make-parameter/parameterize forms in modern versions of Scheme). There were some debates in the following decade, but finally most Lisp dialects adopted the idea of defaulting to lexical scoping, as many other languages do. Since then, the closure, as an implementation technique, has spread more widely and become more popular among PLs of different flavors.
Closures: the evolution
Landin's original paper first defines an environment as a mathematical function mapping names ("constants") to named objects ("primitives"). Then it specifies the environment as "a list-structure made up of name/value pairs". The latter was also implemented in early Lisp implementations as alists (association lists), but modern language implementations do not necessarily follow such details. In particular, environments can be linked to support nested closures, which is unlikely to be directly supported by abstract machines like SECD.
Besides the environment, the other component of the "environment part" in Landin's paper is used to keep the names of bound variable(s) of the lambda abstractions (the formal parameter(s) of the functions). This is also optional (and likely missing) for modern implementations where the names of the parameters can be statically optimized away (spiritually granted by the alpha-renaming rules of lambda calculi), when there is no need to reflect the source information.
Similarly, modern implementations may not save the syntactic constructs (AEs or lambda terms) directly as the control part. Instead, they may use some internal IR (intermediate representation) or a "compiled" form (e.g. FASL, used by some implementations of Lisp dialects). Such an IR is not even guaranteed to be generated from lambda forms (e.g. it can come from the body of some named function).
Moreover, the environment part can save other information that is not needed for the evaluation of the lambda calculi. For example, it can keep an extra identifier providing an additional binding that names the environment at the call site. This can implement languages based on extensions of lambda calculi.
Revisit of PL-specific terminology
Further, some languages may define "closure"-related terms in their specifications to name entities that may be implemented by closures. This is unfortunate, because it leads to many misconceptions like "a closure is a function". But fortunately enough, most languages seem to avoid naming it directly as a syntactic construct in the language.
Nevertheless, this is still better than language specifications arbitrarily overloading more well-established common concepts. To name a few:
"objects" are redirected to "instance of classes" (in Java/CLR/"OOP" languages) instead of traditional "typed storage" (in C and C++) or just "values" (in many Lisps);
"variables" are redirected to something traditional called "objects" (in Golang) as well as mutable states (in many new languages), so it is no longer compatible to mathematics and pure functional languages;
"polymorphism" is restricted to inclusion polymorphism (in C++/"OOP" languages) even these languages do have other kinds of polymorphism (parametric polymorphism and ad-hoc polymorphism).
About resource management
Despite these components being omitted in modern implementations, the definitions in Landin's paper are fairly flexible. It does not limit how to store components like the environments outside the context of the SECD machine.
In practice, various strategies are used. The most common and traditional way is to make all resources owned by a global owner which can collect the resources no longer in use, i.e. the (global) GC, first used in LISP.
Other ways may not need a global owner and have better locality to the closures, for example:
In C++, resources of entities captured in closures may be managed explicitly by users, by specifying how to capture each variable in the lambda-expression's capture list (by value copy, by reference, or even with an explicit initializer) and the exact type of each variable (smart pointers or other types). This can be unsafe, but it gains more flexibility when used correctly.
In Rust, resources are captured with different capture modes (by immutable borrow, by borrow, by move) tried in turn (by the implementation), and users can specify explicit move. This is more conservative than C++, but safer in some sense (since borrows are statically checked, compared to unchecked by-reference captures in C++).
All the strategies above can support closures (C++ and Rust do have language-specific definitions of the concept "closure type"). The disciplines used to manage the resources used by closures have nothing to do with what qualifies as a closure.
So (although not seen here), the claim by Thomas Lord at LtU that graph tracing is necessary for closures is also technically incorrect. Closures can solve the funarg problem because they allow preventing invalid accesses to the activation record (the stack), but that fact does not magically assert that every operation on the resources comprising the closure will be valid. Such mechanisms depend on the external execution environment. It should be clear that, even in traditional implementations, the implicit owner (the GC) is not a component of the closure, and the existence of the owner is an implementation detail of the SECD machine (so it is one of the "higher-order" details to the users). Whether such a detail supports graph tracing or not has no effect on what qualifies as a closure. Besides, AFAIK, the language construct let combined with rec was first introduced (again by P. Landin) in ISWIM in 1966, so it could not have had any effect on the original meaning of closures, which were invented earlier.
The relationships
So, to sum them up, a closure can be (informally) defined as:
(1) a PL implementation-specific data structure comprising an environment part and a control part for a function-like entity, where:
(1.1) the control part is derived from some source language constructs specifying the evaluation construct of the function-like entity;
(1.2) the environment part is comprised of an environment and, optionally, other implementation-defined data;
(1.3) the environment in (1.2) is determined by the potentially context-dependent source language constructs of the function-like entity, and is used to hold the captured free variables occurring in the evaluation construct of the source language constructs creating the function-like entity.
(2) alternatively, the umbrella term of an implementation technique to utilize the entities named "closures" in (1).
Lambda expressions (abstractions) are just one of the syntactic constructs in the source language that introduce (create) unnamed function-like entities. A PL may provide it as the only way to introduce function-like entities.
In general, there is no definite correspondence between lambda expressions in the source program and the existence of closures during the execution of the program. As implementation details having no effect on the observable behavior of the program, a PL implementation is usually allowed to merge resources allocated for closures when possible, or to omit creating them entirely when it does not matter for the program's semantics:
The implementation can check the set of the free variables to be captured in the lambda expression, and when the set is empty, it can avoid introducing the environment part, so the function-like entity will not require a closure to be maintained. Such a strategy is usually mandated by the rules of static languages.
Otherwise, the implementation may or may not create a closure for a function-like entity resulting from evaluating the lambda expression, whether or not there are variables to be captured.
A lambda expression may evaluate to such a function-like entity. Users of some PLs may call such a function-like entity a "closure". "Anonymous function" would be a more neutral name for such a "closure" in this context.
Appendix: functions: the messy history
This is not directly tied to the problem, but it may also be worth noting that "functions" can name different entities in different contexts.
It is already a mess in mathematics.
Currently I am too lazy to sum them up in the context of PLs, but as a caveat: keep an eye on the context to make sure the various definitions of "function" in different PLs do not bias your reasoning away from the topic.
As for the use of "anonymous functions" in general (shared by PLs in practice), I believe it will not introduce significant confusion or misconceptions on this topic, though.
Named functions may have slightly more problems. "Function" may denote the named entity itself (the "symbol"), as well as the evaluated value of that name. Given that most PLs don't have an unevaluated context to differentiate a function from other entities carrying interesting meaning (e.g. sizeof(a_plain_cxx_function) in C++ is just ill-formed), users may not observe the difference, misinterpreting unevaluated operands as evaluated values. That will be problematic with some Lisp dialects having QUOTE. Even experienced PL specialists can easily miss something important; this is also why I emphasize distinguishing syntactic constructs from other entities.
It depends on whether a function uses external variables or not to perform its operation.
External variables - variables defined outside the scope of a function.
Lambda expressions are stateless, because they depend only on parameters, internal variables, or constants to perform operations.
Closures hold state, because they use external variables (i.e. variables defined outside the scope of the function body, such as an int n = 2 declared in the enclosing scope) along with parameters and constants to perform operations.
When Java creates a closure, it keeps the variable n with the function, so it can be referenced when the function is passed to other functions or used anywhere.
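A rough Python analogy for that distinction (not the original Java code; the names are made up for illustration):

double = lambda t: t * 2          # stateless lambda: depends only on its parameter

def make_times_n(n):
    return lambda t: t * n        # closure: n is captured from the enclosing scope

times_2 = make_times_n(2)
print(double(5), times_2(5))      # 10 10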
The question is 12 years old and we still get it as the first link in Google for “closures vs lambda”. So I have to say it as no one did explicitly.
Lambda expression is an anonymous function (declaration).
And a closure, quoting Scott's Programming Language Pragmatics is explained as:
… creating an explicit representation of a referencing environment (generally the one in which the subroutine would execute if called at the present time) and bundling it together with a reference to the subroutine … is referred to as a closure.
That is, a closure is just what we call the bundle of “function + surrounding context”.
Lambda is an anonymous function definition that is not (necessarily) bound to an identifier.
"Anonymous functions originate in the work of Alonzo Church in his invention of the lambda calculus, in which all functions are anonymous" - Wikipedia
Closure is the lambda function implementation.
"Peter J. Landin defined the term closure in 1964 as having an environment part and a control part as used by his SECD machine for evaluating expressions" - Wikipedia
The generic explanation of Lambda and Closure is covered in the other responses.
For those from a C++ background, Lambda expressions were introduced in C++11. Think of Lambdas as a convenient way to create anonymous functions and function objects.
"The distinction between a lambda and the corresponding closure is precisely equivalent to the distinction between a class and an instance of the class. A class exists only in source code; it doesn’t exist at runtime. What exists at runtime are objects of the class type. Closures are to lambdas as objects are to classes. This should not be a surprise, because each lambda expression causes a unique class to be generated (during compilation) and also causes an object of that class type, a closure to be created (at runtime)." - Scott Myers
C++ allows us to examine the nuances of Lambda and Closure as you have to explicitly specify the free variables to be captured.
In the sample below, the Lambda expression has no free variables, an empty capture list ([]). It's essentially an ordinary function and no closure is required in the strictest sense. So it can even be passed as a function pointer argument.
void register_func(void(*f)(int val)) // Works only with an EMPTY capture list
{
    int val = 3;
    f(val);
}

int main()
{
    int env = 5;
    register_func( [](int val){ /* lambda body can access only val variable */ } );
}
As soon as a free variable from the surrounding environment is introduced in the capture list ([env]), a Closure has to be generated.
register_func( [env](int val){ /* lambda body can access val and env variables*/ } );
Since this is no longer an ordinary function but a closure, passing it to register_func (which expects a plain function pointer) produces a compilation error:
no suitable conversion function from "lambda []void (int val)->void" to "void (*)(int val)" exists
The error can be fixed with the function wrapper std::function, which accepts any callable target, including a generated closure:
void register_func(std::function<void(int val)> f)
See Lambda and Closure for a detailed explanation with a C++ example.