
From io.Reader to string in Go

go

I have an io.ReadCloser object (from an http.Response object).

What's the most efficient way to convert the entire stream to a string object?


Stephen Weinberg

EDIT:

Since Go 1.10, strings.Builder exists. Example:

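buf := new(strings.Builder)
// The byte count returned by io.Copy is not needed here, so it is discarded.
if _, err := io.Copy(buf, r); err != nil {
    // handle the error
}
fmt.Println(buf.String())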

OUTDATED INFORMATION BELOW

The short answer is that it will not be efficient, because converting to a string requires a complete copy of the byte array. Here is the proper (non-efficient) way to do what you want:

buf := new(bytes.Buffer)
buf.ReadFrom(yourReader)
s := buf.String() // Does a complete copy of the bytes in the buffer.

This copy is done as a protection mechanism. Strings are immutable. If you could convert a []byte to a string without a copy, you could change the contents of the string by writing to the slice. However, Go allows you to disable the type safety mechanisms using the unsafe package. Use the unsafe package at your own risk. Hopefully the name alone is a good enough warning. Here is how I would do it using unsafe:

buf := new(bytes.Buffer)
buf.ReadFrom(yourReader)
b := buf.Bytes()
s := *(*string)(unsafe.Pointer(&b))

There we go, you have now efficiently converted your byte array to a string. Really, all this does is trick the type system into calling it a string. There are a couple of caveats to this method:

1. There are no guarantees this will work in all Go compilers. While this works with the plan-9 gc compiler, it relies on "implementation details" not mentioned in the official spec. You cannot even guarantee that this will work on all architectures or that it will not be changed in gc. In other words, this is a bad idea.

2. That string is mutable! If you make any calls on that buffer, it will change the string. Be very careful.

My advice is to stick to the official method. Doing a copy is not that expensive and it is not worth the evils of unsafe. If the string is too large to do a copy, you should not be making it into a string.


Thanks, that's a really detailed answer. The "good" way seems roughly equivalent to @Sonia's answer too (since buf.String just does the cast internally).
And it doesn't even work with my version; it seems not to be able to get a Pointer from &buf.Bytes(). Using Go 1.
@sinni800 Thanks for the tip. I forgot function returns were not addressable. It is now fixed.
Well computers are pretty damn fast at copying blocks of bytes. And given this is an http request, I can't imagine a scenario where transmission latency won't be a squillion times larger than the trivial time it takes to copy the byte array. Any functional language copies this type of immutable stuff around all over the place, and still runs plenty fast.
This answer is out-of-date. strings.Builder does this efficiently by ensuring the underlying []byte never leaks, and converting to string without a copy in a way that will be supported going forward. This didn't exist in 2012. @dimchansky's solution below has been the correct one since Go 1.10. Please consider an edit!
aymericbeaumet

Answers so far haven't addressed the "entire stream" part of the question. I think a good way to do this is ioutil.ReadAll (io.ReadAll since Go 1.16). With your io.ReadCloser named rc, I would write:

Go >= v1.16

if b, err := io.ReadAll(rc); err == nil {
    return string(b)
} ...

Go <= v1.15

if b, err := ioutil.ReadAll(rc); err == nil {
    return string(b)
} ...

Thanks, good answer. It looks like buf.ReadFrom() also reads the whole stream up to EOF.
How funny: I just read the implementation of ioutil.ReadAll() and it simply wraps a bytes.Buffer's ReadFrom. And the buffer's String() method is a simple wrap around casting to string – so the two approaches are practically the same!
I did this and it works... the first time. For some reason, after reading the string, subsequent reads return an empty string. Not sure why yet.
@Aldo'xoen'Giambelluca ReadAll consumes the reader, so on the next call there is nothing left to read.
@DanneJ I wrote this some time ago: medium.com/@xoen/… Are there any reasons to not do it?
Xavi
data, _ := ioutil.ReadAll(response.Body)
fmt.Println(string(data))

Anonymous

The most efficient way would be to always use []byte instead of string.

In case you need to print data received from the io.ReadCloser, the fmt package can handle []byte, but it isn't efficient because the fmt implementation will internally convert []byte to string. In order to avoid this conversion, you can implement the fmt.Formatter interface for a type like type ByteSlice []byte.


Is the conversion from []byte to string expensive? I assumed string([]byte) didn't actually copy the []byte, but just interpreted the slice elements as a series of runes. That is why I suggested Buffer.String() weekly.golang.org/src/pkg/bytes/buffer.go?s=1787:1819#L37. I guess it would be good to know what is happening when string([]byte) is called.
Conversion from []byte to string is reasonably fast, but the question was asking about "the most efficient way". Currently, the Go run-time will always allocate a new string when converting []byte to string. The reason for this is that the compiler doesn't know how to determine whether the []byte will be modified after the conversion. There is some room for compiler optimizations here.
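The reason the conversion cannot simply share memory can be seen in a short sketch: the slice may be modified after the conversion, and the string must not change. The function name convertThenMutate is made up for this illustration.

```go
package main

import "fmt"

// convertThenMutate converts b to a string and then overwrites the first
// byte of b. The returned string must still hold the original contents.
func convertThenMutate(b []byte) string {
	s := string(b)
	if len(b) > 0 {
		b[0] = 'J'
	}
	return s
}

func main() {
	b := []byte("hello")
	s := convertThenMutate(b)
	fmt.Println(s, string(b)) // prints "hello Jello"
}
```

The string keeps its original value even though the slice changed, which is the guarantee the copy exists to provide.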
Dimchansky
func copyToString(r io.Reader) (res string, err error) {
    var sb strings.Builder
    if _, err = io.Copy(&sb, r); err == nil {
        res = sb.String()
    }
    return
}

Vojtech Vitek
var b bytes.Buffer
b.ReadFrom(r)

// b.String()

Nate

I like the bytes.Buffer struct. I see it has ReadFrom and String methods. I've used it with a []byte but not an io.Reader.