I often work with slices of structs. Here's an example of such a struct:
type MyStruct struct {
    val1, val2, val3    int
    text1, text2, text3 string
    list                []SomeType
}
So I define my slices as follows:
[]MyStruct
Let's say I have about a million elements in there and I'm working heavily with the slice:
I append new elements often. (The total number of elements is unknown.)
I sort it every now and then.
I also delete elements (although not as much as adding new elements).
I read elements often and pass them around (as function arguments).
The content of the elements themselves doesn't get changed.
My understanding is that this leads to a lot of copying of the actual structs: sorting swaps whole elements, and append periodically moves the entire backing array. The alternative is to create a slice of pointers to the struct:
[]*MyStruct
Now the structs remain where they are and we only deal with pointers which I assume have a smaller footprint and will therefore make my operations faster. But now I'm giving the garbage collector a lot more work.
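For illustration, here is a minimal sketch of the two layouts side by side (sort.Slice and val1 as the sort key are my assumptions, just to make the comparison concrete):

package main

import (
    "fmt"
    "sort"
)

type SomeType struct{}

type MyStruct struct {
    val1, val2, val3    int
    text1, text2, text3 string
    list                []SomeType
}

func main() {
    // Values stored inline: sorting swaps whole structs (96 bytes each on 64-bit).
    structs := []MyStruct{{val1: 2}, {val1: 1}}
    sort.Slice(structs, func(i, j int) bool { return structs[i].val1 < structs[j].val1 })

    // Only 8-byte pointers stored: sorting swaps pointers, the structs stay put.
    pointers := []*MyStruct{{val1: 2}, {val1: 1}}
    sort.Slice(pointers, func(i, j int) bool { return pointers[i].val1 < pointers[j].val1 })

    fmt.Println(structs[0].val1, pointers[0].val1) // 1 1
}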
Can you provide general guidelines of when to work with structs directly vs. when to work with pointers to structs?
Should I worry about how much work I leave to the GC?
Is the performance overhead of copying a struct vs. copying a pointer negligible?
Maybe a million elements is not much. How does all of this change when the slice gets much bigger (but still fits in RAM, of course)?
There are existing discussions of []T vs. []*T -- most rehash what folks have said here, but maybe some other factors come in (say, the concern about holding on to a pointer into a slice after it is reallocated by append).
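To make that gotcha concrete, a minimal sketch (a plain int slice for brevity): a pointer taken into the slice before a growing append keeps referring to the old backing array.

package main

import "fmt"

func main() {
    s := make([]int, 1, 1) // length 1, capacity 1: the next append must reallocate
    p := &s[0]             // pointer into the current backing array

    s = append(s, 2) // capacity exceeded: append copies everything to a new, larger array
    s[0] = 42        // writes to the new array

    fmt.Println(*p, s[0]) // 0 42 -- p still points into the old array
}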
Just got curious about this myself. Ran some benchmarks:
import "testing"

type MyStruct struct {
    F1, F2, F3, F4, F5, F6, F7 string
    I1, I2, I3, I4, I5, I6, I7 int64
}

func BenchmarkAppendingStructs(b *testing.B) {
    var s []MyStruct
    for i := 0; i < b.N; i++ {
        s = append(s, MyStruct{})
    }
}

func BenchmarkAppendingPointers(b *testing.B) {
    var s []*MyStruct
    for i := 0; i < b.N; i++ {
        s = append(s, &MyStruct{})
    }
}
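(These go in a _test.go file; run them with go test -bench=. in the package directory.)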
Results:
BenchmarkAppendingStructs     1000000    3528 ns/op
BenchmarkAppendingPointers    5000000     246 ns/op
Takeaways: we're talking nanoseconds here, so either way is probably negligible for small slices. But at millions of ops, it's the difference between milliseconds and microseconds.
By the way, I tried running the benchmark again with slices that were pre-allocated (with a capacity of 1000000) to eliminate the overhead of append() periodically copying the underlying array. Appending structs dropped about 1000 ns/op; appending pointers didn't change at all.
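For reference, a sketch of what that pre-allocated variant might look like (sizing the capacity by b.N rather than a fixed 1000000, so growth is eliminated for any iteration count; the Prealloc name is mine):

func BenchmarkAppendingStructsPrealloc(b *testing.B) {
    s := make([]MyStruct, 0, b.N) // reserve full capacity up front: append never grows the array
    for i := 0; i < b.N; i++ {
        s = append(s, MyStruct{})
    }
}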
Can you provide general guidelines of when to work with structs directly vs. when to work with pointers to structs?
No, it depends too much on all the other factors you've already mentioned.
The only real answer is: benchmark and see. Every case is different and all the theory in the world doesn't make a difference when you've got actual timings to work with.
(That said, my intuition would be to use pointers, and possibly a sync.Pool to aid the garbage collector: http://golang.org/pkg/sync/#Pool)
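A minimal sketch of that sync.Pool idea (field set trimmed for brevity; resetting before Put is my convention, not something the API requires):

package main

import "sync"

type MyStruct struct {
    val1  int
    text1 string
}

// pool hands out reusable *MyStruct values so short-lived structs
// don't each become separate garbage for the collector.
var pool = sync.Pool{
    New: func() interface{} { return new(MyStruct) },
}

func main() {
    m := pool.Get().(*MyStruct) // reuse a pooled struct, or allocate via New
    m.val1, m.text1 = 42, "hello"
    // ... use m ...
    *m = MyStruct{} // reset so stale data doesn't leak to the next user
    pool.Put(m)
}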
Unlike maps, slices, channels, and functions, which are cheap reference-like values, struct variables are passed by copy, which means more data gets moved around behind the scenes. On the other hand, reducing the number of pointers results in less work for the garbage collector. From my perspective, I would think about three things: the complexity of the struct, the quantity of data to handle, and what you need functionally once you've created your variable (does it need to be mutable when it's passed into a function? etc.).
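To illustrate the mutability point, a minimal sketch (function names are mine):

package main

import "fmt"

type MyStruct struct{ val1 int }

// byValue receives a copy: changes are invisible to the caller.
func byValue(m MyStruct) { m.val1 = 99 }

// byPointer receives an address: changes mutate the caller's struct.
func byPointer(m *MyStruct) { m.val1 = 99 }

func main() {
    m := MyStruct{val1: 1}
    byValue(m)
    fmt.Println(m.val1) // 1 -- only the copy was modified
    byPointer(&m)
    fmt.Println(m.val1) // 99 -- mutated through the pointer
}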
A later rerun of the same benchmarks:

BenchmarkAppendingStructs-8     5000000    387 ns/op
BenchmarkAppendingPointers-8    3000000    422 ns/op