
What is the best way to test for an empty string in Go?

Which method is best (most idiomatic) for testing non-empty strings (in Go)?

if len(mystring) > 0 { }

Or:

if mystring != "" { }

Or something else?


ANisus

Both styles are used within Go's standard library.

if len(s) > 0 { ... }

can be found in the strconv package: http://golang.org/src/pkg/strconv/atoi.go

if s != "" { ... }

can be found in the encoding/json package: http://golang.org/src/pkg/encoding/json/encode.go

Both are idiomatic and clear enough. The choice is mostly a matter of personal taste and of which form reads more clearly in context.

Russ Cox writes in a golang-nuts thread:

The one that makes the code clear. If I'm about to look at element x I typically write len(s) > x, even for x == 0, but if I care about "is it this specific string" I tend to write s == "". It's reasonable to assume that a mature compiler will compile len(s) == 0 and s == "" into the same, efficient code. ... Make the code clear.

As pointed out in Timmmm's answer, the Go compiler does generate identical code in both cases.


I don't agree with this answer. Simply put, if mystring != "" { } is the best, preferred and idiomatic way TODAY. The standard library contains the other form because it was written before 2010, when the len(mystring) == 0 optimization made sense.
@honzajde Just tried to validate your statement, but found commits in the standard library less than 1 year old using len to check empty/non-empty strings. Like this commit by Brad Fitzpatrick. I am afraid it is still a matter of taste and clarity ;)
@honzajde Not trolling. There are 3 len keywords in the commit. I was referring to len(v) > 0 in h2_bundle.go (line 2702). It is not automatically shown as it is generated from golang.org/x/net/http2, I believe.
If it's not in the diff then it's not new. Why don't you post a direct link? Anyway, enough detective work for me... I don't see it.
@honzajde You actually have to expand the diff on that file because it's so large.
zzzz

This seems to be premature micro-optimization. The compiler is free to produce the same code for both cases, or at least for these two

if len(s) != 0 { ... }

and

if s != "" { ... }

because the semantics are clearly identical.
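The equivalence can be checked directly on a few inputs. This is only a sketch: the helper name sameResult is made up for the illustration, and the program compares results, not the generated machine code.

```go
package main

import "fmt"

// sameResult reports whether the two non-emptiness checks agree for s.
// The name is hypothetical, chosen for this illustration only.
func sameResult(s string) bool {
	return (len(s) != 0) == (s != "")
}

func main() {
	for _, s := range []string{"", " ", "a", "héllo"} {
		fmt.Printf("%q: len(s) != 0 is %t, s != \"\" is %t, agree: %t\n",
			s, len(s) != 0, s != "", sameResult(s))
	}
}
```

Whether the compiler also emits the same instructions for both forms is a separate question that needs a look at the disassembly.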


Agreed, however, it really depends on the implementation of string... If strings are implemented like Pascal's then len(s) executes in O(1), and if like C's then it's O(n), or whatever, since len() has to execute to completion.
Have you looked at the code generation to see if the compiler anticipates this or are you only suggesting that a compiler could implement this?
Edwinner

Assuming that empty spaces and all leading and trailing white spaces should be removed:

import "strings"
if len(strings.TrimSpace(s)) == 0 { ... }

Because:
len("") // is 0
len(" ") // one space is 1
len("  ") // two spaces is 2


Why do you have this assumption? The guy clearly tells about the empty string. The same way you can tell, assuming that you want only ascii characters in a string and then add a function that removes all non-ascii chars.
Because len(""), len(" ") and len("  ") are not the same thing in Go. I was assuming he wanted to make sure that a variable he had initialized to one of those earlier is indeed still "technically" empty.
This is actually exactly what I needed from this post. I need the user input to have at least 1 non-whitespace character and this one-liner is clear and concise. All I need to do is make the if condition < 1 +1
Wilhelm Murdoch

Checking for length is a good answer, but you could also account for an "empty" string that is also only whitespace. Not "technically" empty, but if you care to check:

package main

import (
  "fmt"
  "strings"
)

func main() {
  stringOne := "merpflakes"
  stringTwo := "   "
  stringThree := ""

  if len(strings.TrimSpace(stringOne)) == 0 {
    fmt.Println("String is empty!")
  }

  if len(strings.TrimSpace(stringTwo)) == 0 {
    fmt.Println("String two is empty!")
  }

  if len(stringTwo) == 0 {
    fmt.Println("String two is still empty!")
  }

  if len(strings.TrimSpace(stringThree)) == 0 {
    fmt.Println("String three is empty!")
  }
}

TrimSpace will allocate and copy a new string from the original string, so this approach will introduce inefficiencies at scale.
@Dai looking at the source code, that would be only true if, given s is of type string, s[0:i] returns a new copy. Strings are immutable in Go, so does it need to create a copy here?
@MichaelPaesold Right - strings.TrimSpace( s ) will not cause new string allocation and character copy if the string doesn't need trimming, but if the string does need trimming then the extra copy (without whitespace characters) will be invoked.
"technically empty" is the question.
The gocritic linter suggests using strings.TrimSpace(str) == "" instead of the length check.
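The allocation question raised in the comments above can be measured rather than argued. The sketch below uses testing.AllocsPerRun from the standard library, which also works outside go test. The helper name trimAllocs and the sink variable are assumptions made for this illustration. My understanding is that TrimSpace returns a substring of its argument, so I would expect both lines to report 0, but treat that as something to verify on your own toolchain.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the result alive so the compiler cannot discard the call.
var sink string

// trimAllocs returns the average number of heap allocations made by one
// strings.TrimSpace call on s. The helper name is hypothetical.
func trimAllocs(s string) float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = strings.TrimSpace(s)
	})
}

func main() {
	fmt.Println("no trimming needed:", trimAllocs("merpflakes"))
	fmt.Println("trimming needed:   ", trimAllocs("   merpflakes   "))
}
```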
Timmmm

As of now, the Go compiler generates identical code in both cases, so it is a matter of taste. GCCGo does generate different code, but barely anyone uses it so I wouldn't worry about that.

https://godbolt.org/z/fib1x1


Janis Viksne

According to the official guidelines, and from a performance point of view, the two appear equivalent (see ANisus's answer), but s != "" is preferable because of a type-checking advantage: s != "" fails at compile time if the variable is not a string, while len(s) == 0 compiles for several other data types.
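A small sketch of that difference. The function names emptyString and emptyBytes are hypothetical, used here only to show which check compiles for which type.

```go
package main

import "fmt"

// emptyString uses the comparison form, which compiles only for strings.
func emptyString(s string) bool {
	return s == ""
}

// emptyBytes uses len, which also compiles for slices, maps, arrays,
// and channels.
func emptyBytes(b []byte) bool {
	return len(b) == 0
	// Writing `return b == ""` here instead is a compile-time error,
	// because a []byte cannot be compared with a string constant.
}

func main() {
	fmt.Println(emptyString(""), emptyBytes(nil), emptyBytes([]byte("x")))
}
```

If a variable's type later changes from string to, say, []byte, the len form keeps compiling silently, while the comparison form forces you to revisit the check.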


There was a time when I counted CPU cycles and reviewed the assembler that the C compiler produced and deeply understood the structure of C and Pascal strings... even with all the optimizations in the world len() requires just that little extra bit of work. HOWEVER, one thing we used to do in C was cast the left side to a const or put the static string on the left side of the operator to prevent s=="" from becoming s="" which in the C syntax is acceptable... and probably golang too. (see the extended if)
BaCaRoZzo

I think == "" is faster and more readable.

package main

import (
    "fmt"
)
func main() {
    n := 1
    s := ""
    if len(s) == 0 {
        n = 2
    }
    fmt.Printf("%d\n", n)
}

Running dlv debug playground.go and comparing the disassembly for len(s) == 0 and s == "", I got this for the s == "" case:

    playground.go:6         0x1008d9d20     810b40f9        MOVD 16(R28), R1       
    playground.go:6         0x1008d9d24     e28300d1        SUB $32, RSP, R2       
    playground.go:6         0x1008d9d28     5f0001eb        CMP R1, R2             
    playground.go:6         0x1008d9d2c     09070054        BLS 56(PC)             
    playground.go:6         0x1008d9d30*    fe0f16f8        MOVD.W R30, -160(RSP)  

    playground.go:6         0x1008d9d34     fd831ff8        MOVD R29, -8(RSP)      
    playground.go:6         0x1008d9d38     fd2300d1        SUB $8, RSP, R29       
    playground.go:7         0x1008d9d3c     e00340b2        ORR $1, ZR, R0         

    playground.go:7         0x1008d9d40     e01f00f9        MOVD R0, 56(RSP)       
    playground.go:8         0x1008d9d44     ff7f05a9        STP (ZR, ZR), 80(RSP)  

    playground.go:9         0x1008d9d48     01000014        JMP 1(PC)                        
    playground.go:10        0x1008d9d4c     e0037fb2        ORR $2, ZR, R0         

And this for the len(s) == 0 case:

    playground.go:6         0x100761d20     810b40f9        MOVD 16(R28), R1       
    playground.go:6         0x100761d24     e2c300d1        SUB $48, RSP, R2       
    playground.go:6         0x100761d28     5f0001eb        CMP R1, R2             
    playground.go:6         0x100761d2c     29070054        BLS 57(PC)             
    playground.go:6         0x100761d30*    fe0f15f8        MOVD.W R30, -176(RSP)  

    playground.go:6         0x100761d34     fd831ff8        MOVD R29, -8(RSP)      
    playground.go:6         0x100761d38     fd2300d1        SUB $8, RSP, R29       
    playground.go:7         0x100761d3c     e00340b2        ORR $1, ZR, R0         

    playground.go:7         0x100761d40     e02300f9        MOVD R0, 64(RSP)       
    playground.go:8         0x100761d44     ff7f06a9        STP (ZR, ZR), 96(RSP)  
    playground.go:9         0x100761d48     ff2700f9        MOVD ZR, 72(RSP)       

    playground.go:9         0x100761d4c     01000014        JMP 1(PC)              
    playground.go:10        0x100761d50     e0037fb2        ORR $2, ZR, R0         
    playground.go:10        0x100761d54     e02300f9        MOVD R0, 64(RSP)       
    playground.go:10        0x100761d58     01000014        JMP 1(PC)      
    playground.go:6         0x104855d2c     09070054        BLS 56(PC)        



interesting... however, readability depends on whether you THINK in code: "empty string" vs "string of zero length". (I'm not sure how I view it so I will have to catch myself next time). That said, golang treats a string as a slice of runes and I'm not sure if one refers to a slice as empty or zero length. I would imagine that consistency is better. I was once an ASM clock cycle counter but with compiler and runtime optimization I'm not sure counting cycles, jumps, cache invalidation are meaningful anymore. THANKS this brings back memories.
Ioannis Sermetziadis

It would be cleaner and less error-prone to use a function like the one below:

func empty(s string) bool {
    return len(strings.TrimSpace(s)) == 0
}

Markus Linnala

Just to add more to the discussion, mainly about how to do the performance testing.

I tested with the following code:

import (
    "testing"
)

var ss = []string{"Hello", "", "bar", " ", "baz", "ewrqlosakdjhf12934c r39yfashk fjkashkfashds fsdakjh-", "", "123"}

func BenchmarkStringCheckEq(b *testing.B) {
    c := 0
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
            for _, s := range ss {
                    if s == "" {
                            c++
                    }
            }
    } 
    t := 2 * b.N
    if c != t {
            b.Fatalf("did not catch empty strings: %d != %d", c, t)
    }
}
func BenchmarkStringCheckLen(b *testing.B) {
    c := 0
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
            for _, s := range ss { 
                    if len(s) == 0 {
                            c++
                    }
            }
    } 
    t := 2 * b.N
    if c != t {
            b.Fatalf("did not catch empty strings: %d != %d", c, t)
    }
}
func BenchmarkStringCheckLenGt(b *testing.B) {
    c := 0
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
            for _, s := range ss {
                    if len(s) > 0 {
                            c++
                    }
            }
    } 
    t := 6 * b.N
    if c != t {
            b.Fatalf("did not catch empty strings: %d != %d", c, t)
    }
}
func BenchmarkStringCheckNe(b *testing.B) {
    c := 0
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
            for _, s := range ss {
                    if s != "" {
                            c++
                    }
            }
    } 
    t := 6 * b.N
    if c != t {
            b.Fatalf("did not catch empty strings: %d != %d", c, t)
    }
}

And results were:

% for a in $(seq 50);do go test -run=^$ -bench=. --benchtime=1s ./...|grep Bench;done | tee -a log
% sort -k 3n log | head -10

BenchmarkStringCheckEq-4        150149937            8.06 ns/op
BenchmarkStringCheckLenGt-4     147926752            8.06 ns/op
BenchmarkStringCheckLenGt-4     148045771            8.06 ns/op
BenchmarkStringCheckNe-4        145506912            8.06 ns/op
BenchmarkStringCheckLen-4       145942450            8.07 ns/op
BenchmarkStringCheckEq-4        146990384            8.08 ns/op
BenchmarkStringCheckLenGt-4     149351529            8.08 ns/op
BenchmarkStringCheckNe-4        148212032            8.08 ns/op
BenchmarkStringCheckEq-4        145122193            8.09 ns/op
BenchmarkStringCheckEq-4        146277885            8.09 ns/op

In practice, most runs of each variant do not reach its fastest time, and the best times of the variants differ only minimally (about 0.01 ns/op).

Looking at the full log, the difference between runs is greater than the difference between benchmark functions.

There also does not seem to be any measurable difference between BenchmarkStringCheckEq and BenchmarkStringCheckNe, or between BenchmarkStringCheckLen and BenchmarkStringCheckLenGt, even though the latter variants increment c 6 times per pass instead of 2.

You can try to gain some confidence about equal performance by adding benchmarks with a modified test or inner loop. This one is faster:

func BenchmarkStringCheckNone4(b *testing.B) {
    c := 0
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
            for _, _ = range ss {
                    c++
            }
    }
    t := len(ss) * b.N
    if c != t {
            b.Fatalf("did not catch empty strings: %d != %d", c, t)
    }
}

This is not faster:

func BenchmarkStringCheckEq3(b *testing.B) {
    ss2 := make([]string, len(ss))
    prefix := "a"
    for i, _ := range ss {
            ss2[i] = prefix + ss[i]
    }
    c := 0
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
            for _, s := range ss2 {
                    if s == prefix {
                            c++
                    }
            }
    }
    t := 2 * b.N
    if c != t {
            b.Fatalf("did not catch empty strings: %d != %d", c, t)
    }
}

Both variants usually differ from the main tests by more than the main tests differ from one another.

It would also be good to generate the test strings (ss) with a string generator that follows a relevant distribution, and with variable lengths too.

So I have no confidence that there is any performance difference between the main methods of testing for an empty string in Go.

And I can state with some confidence that it is faster not to test for an empty string at all than to test for one, and that it is faster to test for an empty string than to test against a 1-character string (the prefix variant).
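The string generator suggested above could look something like the following. This is a sketch under stated assumptions: the function name genStrings, its parameters, and the choice of lowercase ASCII letters are all invented for the illustration, and a "relevant distribution" for real code would have to come from the actual workload.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// genStrings returns n strings. Each one is empty with probability
// emptyRatio, otherwise a run of 1..maxLen lowercase letters.
// maxLen must be positive. A fixed seed keeps the inputs reproducible
// between benchmark runs. All names here are hypothetical.
func genStrings(n, maxLen int, emptyRatio float64, seed int64) []string {
	rng := rand.New(rand.NewSource(seed))
	out := make([]string, n)
	for i := range out {
		if rng.Float64() < emptyRatio {
			continue // leave out[i] as the empty string
		}
		var sb strings.Builder
		for j, length := 0, 1+rng.Intn(maxLen); j < length; j++ {
			sb.WriteByte(byte('a' + rng.Intn(26)))
		}
		out[i] = sb.String()
	}
	return out
}

func main() {
	fmt.Printf("%q\n", genStrings(8, 16, 0.25, 1))
}
```

The benchmarks would then need to count the empty strings in the generated slice up front instead of hard-coding 2 and 6.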


Brian Leishman

This would be more performant than trimming the whole string, since you only need to check for at least a single non-space character existing

import "unicode"

// Strempty reports whether s is empty or contains only whitespace.
func Strempty(s string) bool {
    if len(s) == 0 {
        return true
    }

    r := []rune(s)
    l := len(r)

    for l > 0 {
        l--
        if !unicode.IsSpace(r[l]) {
            return false
        }
    }

    return true
}
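A variant of the same idea, offered as a sketch rather than a replacement: ranging over the string decodes one rune at a time, so the []rune conversion (which copies the whole string) is not needed and the loop can still stop at the first non-space rune. The name isBlank is hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
)

// isBlank reports whether s is empty or contains only Unicode white space.
// It scans from the front and returns at the first non-space rune.
func isBlank(s string) bool {
	for _, r := range s {
		if !unicode.IsSpace(r) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isBlank(""), isBlank(" \t\n"), isBlank("  x  ")) // true true false
}
```

Unlike Strempty, this scans from the start of the string, which only matters for inputs with a long whitespace run at one end.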

@Richard that may be, but when Googling for "golang check if string is blank" or things similar, this is the only question that comes up, so for those people this is for them, which isn't an unprecedented thing to do on Stack Exchange
Ketan Parmar

I think the best way is to compare with the empty string.

BenchmarkStringCheck1 compares with the empty string.

BenchmarkStringCheck2 checks for length zero.

I benchmarked both checks, with empty and non-empty strings. You can see that comparing with the empty string comes out marginally faster.

BenchmarkStringCheck1-4     2000000000           0.29 ns/op        0 B/op          0 allocs/op
BenchmarkStringCheck1-4     2000000000           0.30 ns/op        0 B/op          0 allocs/op


BenchmarkStringCheck2-4     2000000000           0.30 ns/op        0 B/op          0 allocs/op
BenchmarkStringCheck2-4     2000000000           0.31 ns/op        0 B/op          0 allocs/op

Code

func BenchmarkStringCheck1(b *testing.B) {
    s := "Hello"
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        if s == "" {

        }
    }
}

func BenchmarkStringCheck2(b *testing.B) {
    s := "Hello"
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        if len(s) == 0 {

        }
    }
}

I think this proves nothing: your computer does other things while the benchmark runs, and the difference is too small to say that one is faster than the other. It could hint that both functions were compiled to the same code.
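One further caveat with benchmarks like the two above is that an if with an empty body on a local constant string gives the compiler every opportunity to remove the check entirely. A sketch that reduces that risk, assuming package-level input and sink variables (both names are made up here), and using testing.Benchmark so that it runs as an ordinary program:

```go
package main

import (
	"fmt"
	"testing"
)

// Package-level variables: the compiler cannot treat the input as a
// constant, and the stored result is observable outside the loop.
var (
	input = "Hello"
	sink  bool
)

// benchEq measures the comparison form.
func benchEq(b *testing.B) {
	for n := 0; n < b.N; n++ {
		sink = input == ""
	}
}

// benchLen measures the length form.
func benchLen(b *testing.B) {
	for n := 0; n < b.N; n++ {
		sink = len(input) == 0
	}
}

func main() {
	fmt.Println("s == \"\":    ", testing.Benchmark(benchEq))
	fmt.Println("len(s) == 0:", testing.Benchmark(benchLen))
}
```

Even with this change, differences of a hundredth of a nanosecond are within run-to-run noise, so several runs would be needed before drawing any conclusion.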