The output of a decoder produced from Encoding.NewDecoder differs depending on how you chunk the input to it. I noticed these differences:
1. The decoder may ignore "internal" padding ("=" characters not at the end of the stream). For example, decoding ["QQ==Qg=="] (correctly) results in an error, but ["QQ==", "Qg=="] (incorrectly) decodes to "AB".
2. The byte offset in error messages may get reset to 0 instead of indicating the absolute offset in the stream. For example, decoding ["AAAA####"] says the error occurs at offset 4, but decoding ["AAAA" "####"] says the error occurs at offset 0.
I think that the output of a decoder should always be the same as if the entire Reader were serialized to a string and then passed to DecodeString.
Item 1 is more important IMO. Item 2 was unexpected, but I can live with inconsistent byte offsets in error messages. However, seeing as CorruptInputError is already an int64, absolute offsets would be nice to have if they don't complicate the internals too much.
This bug is somewhat similar to #25296 for encoding/base32.
What version of Go are you using (go version)?
$ go version
go version go1.11.5 linux/amd64
Does this issue reproduce with the latest release?
Yes, using go1.12 on play.golang.org
What operating system and processor architecture are you using (go env)?
go env Output
$ go env
GOARCH="amd64"
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
What did you do?
https://play.golang.org/p/6rcDYtro36S
What did you expect to see?
DecodeString("QQ==Qg==")
"A" illegal base64 data at input byte 4
DecodeString("AAAA####")
"\x00\x00\x00" illegal base64 data at input byte 4
["QQ==Qg=="]
"A" illegal base64 data at input byte 4
["Q" "Q==Qg=="]
"A" illegal base64 data at input byte 4
["QQ==" "Qg=="]
"A" illegal base64 data at input byte 4
["QQ==Qg=" "="]
"A" illegal base64 data at input byte 4
["Q" "Q" "=" "=" "Q" "g" "=" "="]
"A" illegal base64 data at input byte 4
["AAAA####"]
"\x00\x00\x00" illegal base64 data at input byte 4
["AAAA" "####"]
"\x00\x00\x00" illegal base64 data at input byte 4
What did you see instead?
DecodeString("QQ==Qg==")
"A" illegal base64 data at input byte 4
DecodeString("AAAA####")
"\x00\x00\x00" illegal base64 data at input byte 4
["QQ==Qg=="]
"A" illegal base64 data at input byte 4
["Q" "Q==Qg=="]
"A" illegal base64 data at input byte 4
["QQ==" "Qg=="]
"AB" <nil>
["QQ==Qg=" "="]
"AB" <nil>
["Q" "Q" "=" "=" "Q" "g" "=" "="]
"AB" <nil>
["AAAA####"]
"\x00\x00\x00" illegal base64 data at input byte 4
["AAAA" "####"]
"\x00\x00\x00" illegal base64 data at input byte 0
The bug still exists with 1.14.2. I'm not sure why this issue got the Proposal label; it's just a bug in the base64 package.
$ go version
go version go1.14.2 linux/amd64
Here is another demonstration of the bug: the same input is given to a decoder, split differently each time. The output should always be the same, but it is not. It's not hard to imagine a case where a decoder is reading from a network socket, say, and accepts or rejects an input depending on where packet boundaries happen to fall.
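package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"io/ioutil"
)

// test feeds chunks to a base64 decoder through a pipe, one Write per chunk,
// then prints the chunks, the decoded output, and the resulting error.
func test(chunks [][]byte) {
	pr, pw := io.Pipe()
	go func() {
		for _, chunk := range chunks {
			pw.Write(chunk)
		}
		pw.Close()
	}()
	output, err := ioutil.ReadAll(base64.NewDecoder(base64.StdEncoding, pr))
	fmt.Printf("%+q -> %+q %v\n", chunks, output, err)
}

func main() {
	input := []byte("Rw==bw==")
	// Split the same input at every possible position, including both ends.
	for i := 0; i <= len(input); i++ {
		test([][]byte{input[:i], input[i:]})
	}
}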
["" "Rw==bw=="] -> "G" illegal base64 data at input byte 4
["R" "w==bw=="] -> "G" illegal base64 data at input byte 4
["Rw" "==bw=="] -> "G" illegal base64 data at input byte 4
["Rw=" "=bw=="] -> "G" illegal base64 data at input byte 4
["Rw==" "bw=="] -> "Go" <nil>
["Rw==b" "w=="] -> "Go" <nil>
["Rw==bw" "=="] -> "Go" <nil>
["Rw==bw=" "="] -> "Go" <nil>
["Rw==bw==" ""] -> "G" illegal base64 data at input byte 4