I'm new to Go and I'm experiencing a bit of cognitive dissonance between C-style stack-based programming where automatic variables live on the stack and allocated memory lives on the heap and Python-style stack-based-programming where the only thing that lives on the stack are references/pointers to objects on the heap.
As far as I can tell, the two following functions give the same output:
func myFunction() (*MyStructType, error) {
var chunk *MyStructType = new(HeaderChunk)
...
return chunk, nil
}
func myFunction() (*MyStructType, error) {
var chunk MyStructType
...
return &chunk, nil
}
i.e., allocate a new struct and return it.
If I'd written that in C, the first one would have put an object on the heap and the second would have put it on the stack. The first would return a pointer to the heap, the second would return a pointer to the stack, which would have evaporated by the time the function had returned, which would be a Bad Thing.
If I'd written it in Python (or many other modern languages except C#) example 2 would not have been possible.
I get that Go garbage collects both values, so both of the above forms are fine.
To quote:
Note that, unlike in C, it's perfectly OK to return the address of a local variable; the storage associated with the variable survives after the function returns. In fact, taking the address of a composite literal allocates a fresh instance each time it is evaluated, so we can combine these last two lines. http://golang.org/doc/effective_go.html#functions
But it raises a couple of questions.
In example 1, the struct is declared on the heap. What about example 2? Is that declared on the stack in the same way it would be in C or does it go on the heap too? If example 2 is declared on the stack, how does it stay available after the function returns? If example 2 is actually declared on the heap, how is it that structs are passed by value rather than by reference? What's the point of pointers in this case?
It's worth noting that the words "stack" and "heap" do not appear anywhere in the language spec. Your question is worded with "...is declared on the stack," and "...declared on the heap," but note that Go declaration syntax says nothing about stack or heap.
That technically makes the answer to all of your questions implementation dependent. In actuality of course, there is a stack (per goroutine!) and a heap and some things go on the stack and some on the heap. In some cases the compiler follows rigid rules (like "new
always allocates on the heap") and in others the compiler does "escape analysis" to decide if an object can live on the stack or if it must be allocated on the heap.
In your example 2, escape analysis would show the pointer to the struct escaping and so the compiler would have to allocate the struct. I think the current implementation of Go follows a rigid rule in this case however, which is that if the address is taken of any part of a struct, the struct goes on the heap.
For question 3, we risk getting confused about terminology. Everything in Go is passed by value, there is no pass by reference. Here you are returning a pointer value. What's the point of pointers? Consider the following modification of your example:
type MyStructType struct{}
func myFunction1() (*MyStructType, error) {
var chunk *MyStructType = new(MyStructType)
// ...
return chunk, nil
}
func myFunction2() (MyStructType, error) {
var chunk MyStructType
// ...
return chunk, nil
}
type bigStruct struct {
lots [1e6]float64
}
func myFunction3() (bigStruct, error) {
var chunk bigStruct
// ...
return chunk, nil
}
I modified myFunction2 to return the struct rather than the address of the struct. Compare the assembly output of myFunction1 and myFunction2 now,
--- prog list "myFunction1" ---
0000 (s.go:5) TEXT myFunction1+0(SB),$16-24
0001 (s.go:6) MOVQ $type."".MyStructType+0(SB),(SP)
0002 (s.go:6) CALL ,runtime.new+0(SB)
0003 (s.go:6) MOVQ 8(SP),AX
0004 (s.go:8) MOVQ AX,.noname+0(FP)
0005 (s.go:8) MOVQ $0,.noname+8(FP)
0006 (s.go:8) MOVQ $0,.noname+16(FP)
0007 (s.go:8) RET ,
--- prog list "myFunction2" ---
0008 (s.go:11) TEXT myFunction2+0(SB),$0-16
0009 (s.go:12) LEAQ chunk+0(SP),DI
0010 (s.go:12) MOVQ $0,AX
0011 (s.go:14) LEAQ .noname+0(FP),BX
0012 (s.go:14) LEAQ chunk+0(SP),BX
0013 (s.go:14) MOVQ $0,.noname+0(FP)
0014 (s.go:14) MOVQ $0,.noname+8(FP)
0015 (s.go:14) RET ,
Don't worry that myFunction1 output here is different than in peterSO's (excellent) answer. We're obviously running different compilers. Otherwise, see that I modfied myFunction2 to return myStructType rather than *myStructType. The call to runtime.new is gone, which in some cases would be a good thing. Hold on though, here's myFunction3,
--- prog list "myFunction3" ---
0016 (s.go:21) TEXT myFunction3+0(SB),$8000000-8000016
0017 (s.go:22) LEAQ chunk+-8000000(SP),DI
0018 (s.go:22) MOVQ $0,AX
0019 (s.go:22) MOVQ $1000000,CX
0020 (s.go:22) REP ,
0021 (s.go:22) STOSQ ,
0022 (s.go:24) LEAQ chunk+-8000000(SP),SI
0023 (s.go:24) LEAQ .noname+0(FP),DI
0024 (s.go:24) MOVQ $1000000,CX
0025 (s.go:24) REP ,
0026 (s.go:24) MOVSQ ,
0027 (s.go:24) MOVQ $0,.noname+8000000(FP)
0028 (s.go:24) MOVQ $0,.noname+8000008(FP)
0029 (s.go:24) RET ,
Still no call to runtime.new, and yes it really works to return an 8MB object by value. It works, but you usually wouldn't want to. The point of a pointer here would be to avoid pushing around 8MB objects.
type MyStructType struct{}
func myFunction1() (*MyStructType, error) {
var chunk *MyStructType = new(MyStructType)
// ...
return chunk, nil
}
func myFunction2() (*MyStructType, error) {
var chunk MyStructType
// ...
return &chunk, nil
}
In both cases, current implementations of Go would allocate memory for a struct
of type MyStructType
on a heap and return its address. The functions are equivalent; the compiler asm source is the same.
--- prog list "myFunction1" ---
0000 (temp.go:9) TEXT myFunction1+0(SB),$8-12
0001 (temp.go:10) MOVL $type."".MyStructType+0(SB),(SP)
0002 (temp.go:10) CALL ,runtime.new+0(SB)
0003 (temp.go:10) MOVL 4(SP),BX
0004 (temp.go:12) MOVL BX,.noname+0(FP)
0005 (temp.go:12) MOVL $0,AX
0006 (temp.go:12) LEAL .noname+4(FP),DI
0007 (temp.go:12) STOSL ,
0008 (temp.go:12) STOSL ,
0009 (temp.go:12) RET ,
--- prog list "myFunction2" ---
0010 (temp.go:15) TEXT myFunction2+0(SB),$8-12
0011 (temp.go:16) MOVL $type."".MyStructType+0(SB),(SP)
0012 (temp.go:16) CALL ,runtime.new+0(SB)
0013 (temp.go:16) MOVL 4(SP),BX
0014 (temp.go:18) MOVL BX,.noname+0(FP)
0015 (temp.go:18) MOVL $0,AX
0016 (temp.go:18) LEAL .noname+4(FP),DI
0017 (temp.go:18) STOSL ,
0018 (temp.go:18) STOSL ,
0019 (temp.go:18) RET ,
Calls In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution. The return parameters of the function are passed by value back to the calling function when the function returns.
All function and return parameters are passed by value. The return parameter value with type *MyStructType
is an address.
According to Go's FAQ:
if the compiler cannot prove that the variable is not referenced after the function returns, then the compiler must allocate the variable on the garbage-collected heap to avoid dangling pointer errors.
You don't always know if your variable is allocated on the stack or heap. ... If you need to know where your variables are allocated pass the "-m" gc flag to "go build" or "go run" (e.g., go run -gcflags -m app.go).
Source: http://devs.cloudimmunity.com/gotchas-and-common-mistakes-in-go-golang/index.html#stack_heap_vars
func Function1() (*MyStructType, error) {
var chunk *MyStructType = new(HeaderChunk)
...
return chunk, nil
}
func Function2() (*MyStructType, error) {
var chunk MyStructType
...
return &chunk, nil
}
Function1 and Function2 may be inline function. And return variable will not escape. It's not necessary to allocate variable on the heap.
My example code:
package main
type S struct {
x int
}
func main() {
F1()
F2()
F3()
}
func F1() *S {
s := new(S)
return s
}
func F2() *S {
s := S{x: 10}
return &s
}
func F3() S {
s := S{x: 9}
return s
}
According to output of cmd:
go run -gcflags -m test.go
output:
# command-line-arguments
./test.go:13:6: can inline F1
./test.go:18:6: can inline F2
./test.go:23:6: can inline F3
./test.go:7:6: can inline main
./test.go:8:4: inlining call to F1
./test.go:9:4: inlining call to F2
./test.go:10:4: inlining call to F3
/var/folders/nr/lxtqsz6x1x1gfbyp1p0jy4p00000gn/T/go-build333003258/b001/_gomod_.go:6:6: can inline init.0
./test.go:8:4: main new(S) does not escape
./test.go:9:4: main &s does not escape
./test.go:14:10: new(S) escapes to heap
./test.go:20:9: &s escapes to heap
./test.go:19:2: moved to heap: s
If the compiler is smart enough, F1() F2() F3() may not be called. Because it makes no means.
Don't care about whether a variable is allocated on heap or stack, just use it. Protect it by mutex or channel if necessary.
//go:noinline
before a function to prevent inlining to test out the code. Question is really more of clarification of a concept in case compiler doesn't opt for inlining.
Success story sharing
new
actually always allocate on the heap?runtime.new
? Also, why is it good ifruntime.new
is gone?