Are there any guidelines in Scala on when to use val with a mutable collection versus using var with an immutable collection? Or should you really aim for val with an immutable collection?
The fact that there are both types of collection gives me a lot of choice, and often I don't know how to make that choice.
Pretty common question, this one. The hard thing is finding the duplicates.
You should strive for referential transparency. What that means is that, if I have an expression "e", I could make a val x = e, and replace e with x. This is the property that mutability breaks. Whenever you need to make a design decision, maximize for referential transparency.
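To make the substitution idea concrete, here is a minimal sketch. The names doubled and appendAndSize are made up for illustration: the first is a pure function over an immutable Vector, the second mutates the ArrayBuffer it is given.

```scala
import scala.collection.mutable.ArrayBuffer

object ReferentialTransparencyDemo {
  // Pure: the same argument always yields the same result.
  def doubled(xs: Vector[Int]): Vector[Int] = xs.map(_ * 2)

  // Impure: every call grows the buffer, so repeated calls give different results.
  def appendAndSize(buf: ArrayBuffer[Int]): Int = {
    buf += 0
    buf.size
  }

  def main(args: Array[String]): Unit = {
    val xs = Vector(1, 2, 3)
    val x = doubled(xs)
    // Replacing the expression doubled(xs) with x changes nothing.
    println(doubled(xs) ++ doubled(xs) == x ++ x) // true

    val buf = ArrayBuffer(1, 2, 3)
    val first = appendAndSize(buf)  // 4
    val second = appendAndSize(buf) // 5
    // The same expression evaluated twice disagrees with itself,
    // so it cannot be replaced by a val holding its first result.
    println(first == second) // false
  }
}
```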
As a practical matter, a method-local var is the safest var that exists, since it doesn't escape the method. If the method is short, even better. If it isn't, try to reduce it by extracting other methods.
On the other hand, a mutable collection has the potential to escape, even if it doesn't. When changing code, you might then want to pass it to other methods, or return it. That's the kind of thing that breaks referential transparency.
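A rough sketch of the difference, with hypothetical names throughout (collectLocal, drainSum, collectAndSum). The local var is invisible to everyone else, while the local buffer escapes the moment it is handed to a helper that happens to consume it.

```scala
import scala.collection.mutable.ArrayBuffer

object EscapeDemo {
  // Local var + immutable List: nothing outside this method can see or change acc.
  def collectLocal(limit: Int): List[Int] = {
    var acc = List.empty[Int]
    for (i <- 1 to limit) acc = i :: acc
    acc.reverse
  }

  // Hypothetical helper that removes elements from its argument while summing.
  def drainSum(buf: ArrayBuffer[Int]): Int = {
    var total = 0
    while (buf.nonEmpty) total += buf.remove(buf.length - 1)
    total
  }

  // The buffer is declared locally as a val, but it escapes once passed to drainSum.
  def collectAndSum(limit: Int): (Int, Int) = {
    val buf = ArrayBuffer.empty[Int]
    for (i <- 1 to limit) buf += i
    val total = drainSum(buf)
    (total, buf.length) // the length is now 0, not limit
  }

  def main(args: Array[String]): Unit = {
    println(collectLocal(4))        // List(1, 2, 3, 4)
    println(collectAndSum(4) == ((10, 0))) // true
  }
}
```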
On an object (a field), pretty much the same thing happens, but with more dire consequences. Either way the object will have state and, therefore, break referential transparency. But having a mutable collection means even the object itself might lose control of who's changing it.
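As an illustration of the field case, the two classes below are invented for this sketch. Both hold state, so neither is referentially transparent, but only the first lets outside code change that state behind the object's back.

```scala
import scala.collection.mutable.ArrayBuffer

object FieldDemo {
  // val + mutable collection: any caller holding names can change the internal state.
  class LeakyRegistry {
    val names: ArrayBuffer[String] = ArrayBuffer.empty
    def register(name: String): Unit = { names += name; () }
  }

  // private var + immutable collection: only this class can replace the state.
  class GuardedRegistry {
    private var current: Vector[String] = Vector.empty
    def register(name: String): Unit = { current = current :+ name }
    def names: Vector[String] = current
  }

  def main(args: Array[String]): Unit = {
    val leaky = new LeakyRegistry
    leaky.register("a")
    leaky.names.clear()          // outside code wipes the registry
    println(leaky.names.isEmpty) // true

    val guarded = new GuardedRegistry
    guarded.register("a")
    val seen = guarded.names     // an immutable snapshot
    guarded.register("b")
    println(seen)                // Vector(a)
    println(guarded.names)       // Vector(a, b)
  }
}
```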
If you work with immutable collections and you need to "modify" them, for example, add elements to them in a loop, then you have to use vars because you need to store the resulting collection somewhere. If you only read from immutable collections, then use vals.
In general, make sure that you don't confuse references and objects. vals are immutable references (like constant pointers in C). That is, when you use val x = new MutableFoo(), you'll be able to change the object that x points to, but you won't be able to change to which object x points. The opposite holds if you use var x = new ImmutableFoo(). Picking up my initial advice: if you don't need to change to which object a reference points, use vals.
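The reference/object distinction can be sketched with standard collections standing in for MutableFoo and ImmutableFoo (an assumption made here so the example runs as written). A second reference to the same object shows which of the two actually changed.

```scala
import scala.collection.mutable.ArrayBuffer

object ReferenceVsObjectDemo {
  // val + mutable object: the reference is fixed, the object it points to changes.
  def mutateThroughVal(): (List[Int], Boolean) = {
    val x = ArrayBuffer(1, 2)
    val alias = x
    x += 3
    // x = ArrayBuffer(9) // would not compile: reassignment to val
    (alias.toList, alias eq x) // the alias sees the change; still the same object
  }

  // var + immutable object: the reference moves, the old object stays as it was.
  def reassignVar(): (Vector[Int], Vector[Int]) = {
    var x = Vector(1, 2)
    val old = x
    x = x :+ 3
    (old, x)
  }

  def main(args: Array[String]): Unit = {
    val (aliasContents, sameObject) = mutateThroughVal()
    println(aliasContents) // List(1, 2, 3)
    println(sameObject)    // true

    val (old, current) = reassignVar()
    println(old)           // Vector(1, 2)
    println(current)       // Vector(1, 2, 3)
  }
}
```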
var immutable = something(); immutable = immutable.update(x) defeats the purpose of using an immutable collection. You've already given up referential transparency and you can usually get the same effect from a mutable collection with better time complexity. Of the four possibilities (val and var, mutable and immutable), this one makes the least sense. I do often use val mutable.
var list: List[X] = Nil; list = item :: list; ... and I'd forgotten that I once wrote differently.
The best way to answer this is with an example. Suppose we have some process simply collecting numbers for some reason. We wish to log these numbers, and will send the collection to another process to do this.
Of course, we are still collecting numbers after we send the collection to the logger. And let's say there is some overhead in the logging process that delays the actual logging. Hopefully you can see where this is going.
If we store this collection in a mutable val (mutable because we are continuously adding to it), the process doing the logging will be looking at the same object that's still being updated by our collection process. That collection may be updated at any time, and so when it's time to log we may not actually be logging the collection we sent.
If we use an immutable var, we send an immutable data structure to the logger. When we add more numbers to our collection, we will be replacing our var with a new immutable data structure. This doesn't mean the collection sent to the logger is replaced! The logger is still referencing the collection it was sent. So our logger will indeed log the collection it received.
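The scenario can be simulated without real processes. In this sketch deferLog is a hypothetical stand-in for the slow logger: it captures the collection when called and only renders it when the returned function is invoked later.

```scala
import scala.collection.mutable.ArrayBuffer

object DeferredLoggingDemo {
  // Hypothetical logger: holds on to the collection now, formats it later.
  def deferLog(numbers: Iterable[Int]): () => String =
    () => numbers.mkString(", ")

  // mutable val: the logger and the collector share one object.
  def withMutableVal(): String = {
    val numbers = ArrayBuffer(1, 2, 3)
    val pending = deferLog(numbers)
    numbers += 4 // collected after the hand-off
    pending()    // the late addition leaks into the log
  }

  // immutable var: the logger keeps the Vector it was given.
  def withImmutableVar(): String = {
    var numbers = Vector(1, 2, 3)
    val pending = deferLog(numbers)
    numbers = numbers :+ 4 // replaces our reference only
    pending()              // the log shows what was sent
  }

  def main(args: Array[String]): Unit = {
    println(withMutableVal())   // 1, 2, 3, 4
    println(withImmutableVar()) // 1, 2, 3
  }
}
```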
I think the examples in this blog post will shed more light, as the question of which combo to use becomes even more important in concurrency scenarios: importance of immutability for concurrency. And while we're at it, note the preferred use of synchronized vs @volatile vs something like AtomicReference: three tools.
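As one possible reading of the AtomicReference option (not taken from the linked posts), the sketch below keeps an immutable Vector inside a java.util.concurrent.atomic.AtomicReference. The Collector class and collectConcurrently are hypothetical names, and the lambda syntax assumes Scala 2.12 or later.

```scala
import java.util.concurrent.atomic.AtomicReference

object AtomicCollectorDemo {
  final class Collector {
    // The reference is swapped atomically; each Vector it points to never changes.
    private val numbers = new AtomicReference(Vector.empty[Int])

    // updateAndGet retries the update if another thread swapped the reference first.
    def add(n: Int): Unit = {
      numbers.updateAndGet(v => v :+ n)
      ()
    }

    // Safe to hand to a logger: later adds cannot alter this Vector.
    def snapshot: Vector[Int] = numbers.get()
  }

  // Starts several threads that all add to one collector, then waits for them.
  def collectConcurrently(threads: Int, perThread: Int): Vector[Int] = {
    val collector = new Collector
    val workers = (1 to threads).map { _ =>
      new Thread(() => (1 to perThread).foreach(collector.add))
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    collector.snapshot
  }

  def main(args: Array[String]): Unit = {
    // No update is lost, although the order of elements depends on scheduling.
    println(collectConcurrently(4, 1000).size) // 4000
  }
}
```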
var immutable vs. val mutable
In addition to the many excellent answers to this question, here is a simple example that illustrates the potential dangers of val mutable:
Mutable objects can be modified inside methods that take them as parameters, while reassignment is not allowed.
import scala.collection.mutable.ArrayBuffer

object MyObject {
  def main(args: Array[String]): Unit = {
    val a = ArrayBuffer(1, 2, 3, 4)
    silly(a)
    println(a) // a has been modified here
  }

  // Appends to the caller's buffer: the val in main does not prevent this.
  def silly(a: ArrayBuffer[Int]): Unit = {
    a += 10
    println(s"length: ${a.length}")
  }
}
Result:
length: 5
ArrayBuffer(1, 2, 3, 4, 10)
Something like this cannot happen with var immutable, because reassignment is not allowed:
object MyObject {
  def main(args: Array[String]): Unit = {
    var v = Vector(1, 2, 3, 4)
    silly(v)
    println(v)
  }

  def silly(v: Vector[Int]): Unit = {
    v = v :+ 10 // This line is not valid: the parameter v is a val
    println(s"length of v: ${v.length}")
  }
}
Results in:
error: reassignment to val
Since function parameters are treated as val, this reassignment is not allowed.
mutable val is not possible with the immutable var. What here is incorrect?
+= method like array buffer. Your answer implies that += is the same as x = x + y, which it is not. Your statement that function params are treated as vals is correct, and you do get the error you mention, but only because you used =. You can get the same error with an ArrayBuffer, so the collection's mutability here isn't really relevant. So it's not a good answer, because it's not getting at what the OP is talking about. Though it is a good example of the dangers of passing a mutable collection around if you didn't intend to.
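The point about += can be sketched as follows, assuming an immutable Set for the var case. ArrayBuffer defines a += method, so the call mutates the buffer in place. An immutable Set has no such method, so for a var the compiler rewrites s += 4 into s = s + 4, which is a reassignment.

```scala
import scala.collection.mutable.ArrayBuffer

object PlusEqualsDemo {
  // += is a real method on ArrayBuffer: one object, changed in place.
  def bufferPlusEquals(): (List[Int], List[Int]) = {
    val buf = ArrayBuffer(1, 2, 3)
    val alias = buf
    buf += 4
    (alias.toList, buf.toList) // both views show the new element
  }

  // No += on immutable Set: s += 4 becomes s = s + 4, so s must be a var.
  def setPlusEquals(): (Set[Int], Set[Int]) = {
    var s = Set(1, 2, 3)
    val old = s
    s += 4
    (old, s) // the old set is untouched
  }

  def main(args: Array[String]): Unit = {
    val (aliasView, bufView) = bufferPlusEquals()
    println(aliasView == bufView) // true

    val (old, current) = setPlusEquals()
    println(old == Set(1, 2, 3))        // true
    println(current == Set(1, 2, 3, 4)) // true
  }
}
```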
ArrayBuffer, by using Vector. The OP's question is broad, but they were looking for suggestions on when to use which, so I believe my answer is useful because it illustrates the dangers of passing around a mutable collection (the fact that it is a val does not help); immutable var is safer than mutable val.
immutable val over immutable var over mutable val over mutable var. Especially immutable var over mutable val!
var. Another nice feature of using immutable collections is that you can efficiently keep old copies around, even if the var mutates.
var x: Set[Int] over val x: mutable.Set[Int] since if you pass x to some other function, in the former case, you are sure that function cannot mutate x for you.
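Both remarks can be illustrated in a few lines. The callee names addZero and addZeroInPlace are hypothetical: with an immutable Set a callee can only return a new set, while with a mutable one it can change the caller's data.

```scala
import scala.collection.mutable

object PassingSetsDemo {
  // With an immutable Set the callee cannot touch the caller's set.
  def addZero(s: Set[Int]): Set[Int] = s + 0

  // With a mutable Set the callee changes the very object the caller holds.
  def addZeroInPlace(s: mutable.Set[Int]): Unit = { s += 0; () }

  def main(args: Array[String]): Unit = {
    var x: Set[Int] = Set(1, 2)
    val snapshot = x            // keeping an old copy is just another reference
    val withZero = addZero(x)   // a new set; x is unchanged
    x = x + 3                   // the var moves on, the snapshot does not
    println(snapshot == Set(1, 2))    // true
    println(withZero == Set(0, 1, 2)) // true
    println(x == Set(1, 2, 3))        // true

    val y: mutable.Set[Int] = mutable.Set(1, 2)
    addZeroInPlace(y)
    println(y.contains(0))            // true
  }
}
```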