
How can I convert a zero-terminated byte array to string?

go

I need to read [100]byte to transfer a bunch of string data.

Because not all of the strings are precisely 100 characters long, the remaining part of the byte array is padded with 0s.

If I convert [100]byte to string with string(byteArray[:]), the trailing 0s are displayed as ^@^@s.

In C, the string will terminate upon 0, so what's the best way to convert this byte array to string in Go?

@AndréLaszlo: In the playground the ^@ doesn't show, but it would have been there if you had tested it in a terminal or something similar. The reason is that Go does not stop converting the byte array to a string when it finds a 0. len(string(bytes)) in your example is 5 and not 1. Whether the string is printed in full (with zeros) depends on the output function.
For the http response body, use string(body).

Deleplace

Methods that read data into byte slices return the number of bytes read. You should save that number and then use it to create your string. If n is the number of bytes read, your code would look like this:

s := string(byteArray[:n])

To convert the full array, this can be used:

s := string(byteArray[:len(byteArray)])

This is equivalent to:

s := string(byteArray[:])

If for some reason you don't know n, you could use the bytes package to find it, assuming your input doesn't have a null character embedded in it.

n := bytes.Index(byteArray[:], []byte{0})

Or as icza pointed out, you can use the code below:

n := bytes.IndexByte(byteArray[:], 0)
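
Putting the pieces above together, a minimal sketch might look like the following. The helper name cString is made up for illustration. Note that bytes.IndexByte returns -1 when no zero byte is present, so that case has to fall back to the full length before slicing.

```go
package main

import (
	"bytes"
	"fmt"
)

// cString is a hypothetical helper (not part of the standard library).
// It returns the bytes of b up to, but not including, the first zero byte.
// If b contains no zero byte, the whole slice is converted.
func cString(b []byte) string {
	n := bytes.IndexByte(b, 0)
	if n == -1 {
		n = len(b) // no terminator found: use everything
	}
	return string(b[:n])
}

func main() {
	byteArray := [100]byte{'a', 'b', 'c'} // the remaining 97 bytes are zero
	s := cString(byteArray[:])
	fmt.Println(len(s), s) // prints: 3 abc
}
```

The -1 check matters for a string that fills the array completely: without it, byteArray[:n] would panic.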

I know I'm a year late, but I should mention that most methods return the number of bytes read. For instance, binary.Read() can read into a [32]byte, but you don't know whether you've filled all 32 bytes or not.
You should use bytes.IndexByte() which searches for a single byte instead of bytes.Index() with a byte slice containing 1 byte.
Actually, string(byteArray) will do too and saves a slice creation (though this only compiles when byteArray is a slice; an array must be sliced with [:] first).
Just to be clear though, this is casting a sequence of bytes to something that is hopefully a valid UTF-8 string (and not say, Latin-1 etc., or some malformed UTF-8 sequence). Go will not check this for you when you cast.
@CameronKerr From blog.golang.org/strings: "It's important to state right up front that a string holds arbitrary bytes. It is not required to hold Unicode text, UTF-8 text, or any other predefined format. As far as the content of a string is concerned, it is exactly equivalent to a slice of bytes."
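
If the bytes come from a source that may not be UTF-8, the check mentioned in the comments above can be done explicitly. This is only a sketch: toValidString is a hypothetical helper name, and replacing bad sequences with U+FFFD is one possible policy, not something the answer prescribes.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// toValidString is a hypothetical helper: it converts b to a string and
// replaces each run of invalid UTF-8 bytes with the Unicode replacement
// character U+FFFD.
func toValidString(b []byte) string {
	if utf8.Valid(b) {
		return string(b)
	}
	return strings.ToValidUTF8(string(b), "\uFFFD")
}

func main() {
	good := []byte("héllo")
	bad := []byte{'h', 0xff, 'i'} // 0xff can never appear in valid UTF-8

	fmt.Println(utf8.Valid(good), utf8.Valid(bad)) // prints: true false
	fmt.Printf("%q\n", toValidString(bad))
}
```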
Peter Mortensen

Use:

s := string(byteArray[:])

the question specifically says that string(byteArray[:]) contains ^@ characters
What's the difference from string(byteArray)? Why do you need to copy the array using [:]?
@RobertZaremba > a string is in effect a read-only slice of bytes. You can't convert a byte array directly to a string, so slice it first and then convert.
@RobertZaremba For byte slices you don't need to add the [:], for byte arrays, you do.
Please read the question. It explicitly states that this does not terminate the string (as C would with say a strcpy) at the first null. It is very annoying that this incorrect answer has been so upvoted. The best answer IMO, is that mentioned above, with the use of IndexByte().
marcusljx

Simplistic solution:

str := fmt.Sprintf("%s", byteArray)

I'm not sure how performant this is though.


Unfortunately, this does not remove the trailing zeroes. str has length 100. See play.golang.org/p/XWrmqCbIwkB
Doesn't work, don't know why many upvote this answer
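
The objection in the comments above is easy to reproduce. The sketch below formats a zero-padded array with %s and reports the resulting length; sprintfLen is just an illustrative name.

```go
package main

import "fmt"

// sprintfLen reports the length of the string produced by formatting
// a zero-padded [100]byte with %s.
func sprintfLen() int {
	byteArray := [100]byte{'a', 'b', 'c'}
	str := fmt.Sprintf("%s", byteArray)
	return len(str)
}

func main() {
	// The zero bytes are still part of the string.
	fmt.Println(sprintfLen()) // prints: 100
}
```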
peterSO

For example,

package main

import "fmt"

func CToGoString(c []byte) string {
    n := -1
    for i, b := range c {
        if b == 0 {
            break
        }
        n = i
    }
    return string(c[:n+1])
}

func main() {
    c := [100]byte{'a', 'b', 'c'}
    fmt.Println("C: ", len(c), c[:4])
    g := CToGoString(c[:])
    fmt.Println("Go:", len(g), g)
}

Output:

C:  100 [97 98 99 0]
Go: 3 abc

T0xicCode

The following code is looking for '\0', and under the assumptions of the question the array can be considered sorted since all non-'\0' precede all '\0'. This assumption won't hold if the array can contain '\0' within the data.

Find the location of the first zero-byte using a binary search, then slice.

You can find the zero-byte like this:

package main

import "fmt"

func FirstZero(b []byte) int {
    min, max := 0, len(b)
    for {
        if min+1 == max {
            return max
        }
        mid := (min + max) / 2
        if b[mid] == '\000' {
            max = mid
        } else {
            min = mid
        }
    }
    return len(b)
}
func main() {
    b := []byte{1, 2, 3, 0, 0, 0}
    fmt.Println(FirstZero(b))
}

It may be faster just to naively scan the byte array looking for the zero-byte, especially if most of your strings are short.


Your code doesn't compile and, even if it did, it won't work. A binary search algorithm finds the position of a specified value within a sorted array. The array is not necessarily sorted.
@peterSO You are right, and in fact it is never sorted since it represents a bunch of meaningful names.
If all the null bytes are at the end of the string a binary search works.
I don't understand the downvotes. The code compiles and is correct, assuming the string contains no \0 except at the end. The code's looking for \0, and under the assumptions of the question the array can be considered 'sorted', since all non-\0 precede all \0 and that's all the code is checking. If downvoters could find an example input on which the code doesn't work, then I'll remove the answer.
Gives wrong result if input is []byte{0}. In this case FirstZero() should return 0 so when slicing result would be "", but instead it returns 1 and slicing results in "\x00".
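
For reference, the edge case raised in the last comment goes away if the search is delegated to sort.Search from the standard library. This is a sketch of an alternative, not the answer's original code; firstZero is an illustrative name, and the same assumption applies (all zero bytes sit at the end of the slice).

```go
package main

import (
	"fmt"
	"sort"
)

// firstZero returns the index of the first zero byte in b, or len(b) if
// there is none. It assumes every zero byte sits at the end of b (pure
// padding); with embedded zeros the result is not guaranteed to be the
// first one.
func firstZero(b []byte) int {
	return sort.Search(len(b), func(i int) bool { return b[i] == 0 })
}

func main() {
	fmt.Println(firstZero([]byte{1, 2, 3, 0, 0, 0})) // prints: 3
	fmt.Println(firstZero([]byte{0}))                // prints: 0
	fmt.Println(firstZero([]byte{1, 2}))             // prints: 2
}
```

Slicing with b[:firstZero(b)] then yields "" for []byte{0} and also works for an empty slice.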
zach

When you do not know the exact length of the non-zero bytes in the array, you can trim it first:

string(bytes.Trim(arr, "\x00"))


a) bytes.Trim takes a slice, not an array (you'd need arr[:] if arr is actually a [100]byte as the question states). b) bytes.Trim is the wrong function to use here. For input like []byte{0,0,'a','b','c',0,'d',0} it will return "abc\x00d" instead of "". c) There already is a correct answer that uses bytes.IndexByte, the best way to find the first zero byte.
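
The difference described in point b) can be seen side by side. In this sketch, trimmed and upToFirstZero are illustrative helper names; the input is the one from the comment.

```go
package main

import (
	"bytes"
	"fmt"
)

// trimmed strips zero bytes from both ends only; zeros in the middle stay.
func trimmed(b []byte) string {
	return string(bytes.Trim(b, "\x00"))
}

// upToFirstZero stops at the first zero byte, as C string functions do.
func upToFirstZero(b []byte) string {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		return string(b[:i])
	}
	return string(b)
}

func main() {
	in := []byte{0, 0, 'a', 'b', 'c', 0, 'd', 0}
	fmt.Printf("%q\n", trimmed(in))       // prints: "abc\x00d"
	fmt.Printf("%q\n", upToFirstZero(in)) // prints: ""
}
```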
Laevus Dexter

Only use for performance tuning.

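package main

import (
    "fmt"
    "unsafe"
)

// BytesToString reinterprets b as a string without copying (Go 1.20+).
// The caller must not modify b afterwards.
func BytesToString(b []byte) string {
    return unsafe.String(unsafe.SliceData(b), len(b))
}

// StringToBytes reinterprets s as a byte slice without copying (Go 1.20+).
// The returned slice must never be written to.
func StringToBytes(s string) []byte {
    return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
    b := []byte{'b', 'y', 't', 'e'}
    s := BytesToString(b)
    fmt.Println(s)
    b = StringToBytes(s)
    fmt.Println(string(b))
}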

-1: Not sure if this is a serious answer, but you almost definitely do not want to invoke reflection and unsafe code just to convert a byte slice to string
A word of warning: using unsafe to convert a byte slice to a string may have serious implications if the byte slice is modified later. string values in Go are defined to be immutable, and the entire Go runtime and its libraries build on that. You will teleport yourself into the middle of the most mysterious bugs and runtime errors if you go down this path.
Edited, because this goes against the pointer usage rules (it has the same behavior as direct casting; in other words, the result will not be garbage collected). Read paragraph (6) of golang.org/pkg/unsafe/#Pointer
Peter Mortensen

Use this:

bytes.NewBuffer(byteArray).String()

Because a) the question says an array, so you'd need byteArray[:] since bytes.NewBuffer takes a []byte; b) the question said the array has trailing zeros that you don't deal with; c) if instead your variable is a []byte (the only way your line will compile) then your line is just a slow way of doing string(v).
Peter Mortensen

Though not extremely performant, the only readable solution is:

  // Split by separator and pick the first one.
  // This has all the characters till null, excluding null itself.
  retByteArray := bytes.Split(byteArray[:], []byte{0})[0]

  // OR

  // If you want a true C-like string, including the null character
  retByteArray := bytes.SplitAfter(byteArray[:], []byte{0})[0]

A full example to have a C-style byte array:

package main

import (
    "bytes"
    "fmt"
)

func main() {
    var byteArray = [6]byte{97, 98, 0, 100, 0, 99}

    cStyleString := bytes.SplitAfter(byteArray[:], []byte{0})[0]
    fmt.Println(cStyleString)
}

A full example to have a Go style string excluding the nulls:

package main

import (
    "bytes"
    "fmt"
)

func main() {
    var byteArray = [6]byte{97, 98, 0, 100, 0, 99}

    goStyleString := string(bytes.Split(byteArray[:], []byte{0})[0])
    fmt.Println(goStyleString)
}

This allocates a slice of byte slices, so keep an eye on performance if it is used heavily or repeatedly.
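
If the extra allocation is a concern, one possible alternative (not part of the answer above) is bytes.Cut, available since Go 1.18, which returns only the parts before and after the first separator. The helper name beforeNull is made up for this sketch.

```go
package main

import (
	"bytes"
	"fmt"
)

// beforeNull returns the part of b that precedes the first zero byte.
// bytes.Cut returns b unchanged as its first result when the separator
// is absent, so unterminated input is converted in full.
func beforeNull(b []byte) string {
	head, _, _ := bytes.Cut(b, []byte{0})
	return string(head)
}

func main() {
	byteArray := [6]byte{97, 98, 0, 100, 0, 99}
	fmt.Println(beforeNull(byteArray[:])) // prints: ab
}
```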


Peter Mortensen

Use slices instead of arrays for reading. For example, io.Reader accepts a slice, not an array.

Use slicing instead of zero padding.

Example:

buf := make([]byte, 100)
n, err := myReader.Read(buf)
if n == 0 && err != nil {
    log.Fatal(err)
}

consume(buf[:n]) // consume() will see an exact (not padded) slice of read data

The data is written by others, in C, and I only get to read it, so I cannot control the way it is written.
Oh, then slice the byte array using a length value s := a[:n] or s := string(a[:n]) if you need a string. If n is not directly available it must be computed, e.g. by looking for a specific/zero byte in the buffer (array) as Daniel suggests.
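
For the situation described in these two comments (fixed-size, zero-padded records produced by a C program), a minimal sketch could look like this. The 100-byte record size comes from the question; readName and cRecord are hypothetical names, and cRecord exists only to simulate the C-side data.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// readName is a hypothetical helper: it reads one 100-byte, zero-padded
// record from r and returns the text before the first zero byte.
func readName(r io.Reader) (string, error) {
	var rec [100]byte
	if _, err := io.ReadFull(r, rec[:]); err != nil {
		return "", err
	}
	n := bytes.IndexByte(rec[:], 0)
	if n == -1 {
		n = len(rec) // the text fills the whole record
	}
	return string(rec[:n]), nil
}

// cRecord simulates data written by a C program: s padded with zero
// bytes to size bytes (s is truncated if it is longer than size).
func cRecord(s string, size int) io.Reader {
	data := make([]byte, size)
	copy(data, s)
	return bytes.NewReader(data)
}

func main() {
	name, err := readName(cRecord("hello", 100))
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(len(name), name) // prints: 5 hello
}
```

io.ReadFull is used here so that a short read surfaces as an error instead of silently leaving part of the record zeroed.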
Zombo

Here is an option that removes the null bytes:

package main
import "golang.org/x/sys/windows"

func main() {
   b := []byte{'M', 'a', 'r', 'c', 'h', 0}
   s := windows.ByteSliceToString(b)
   println(s == "March")
}

https://pkg.go.dev/golang.org/x/sys/unix#ByteSliceToString

https://pkg.go.dev/golang.org/x/sys/windows#ByteSliceToString


This seems to only work in Windows. In Linux, I get build constraints exclude all Go files in...
