encoding/base32: decoder output depends on chunking of underlying reader #38657

AxbB36 · 2020-04-25T17:19:53Z

#31626 is about a problem that affects encoding/base64. A similar problem affects encoding/base32.

A decoder created by NewDecoder is sensitive to how its input is split into pieces, and it should not be. Whether the decoder treats the input as valid can depend on, for example, whether the underlying Reader yields one big byte slice or two smaller ones.

What version of Go are you using (`go version`)?

$ go version
go version go1.14.2 linux/amd64

Does this issue reproduce with the latest release?

Yes, with go1.14.2 on play.golang.org.

What operating system and processor architecture are you using (`go env`)?

$ go env
GOARCH="amd64"
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"

What did you do?

https://play.golang.org/p/KV4XPizhYj9

package main

import (
	"encoding/base32"
	"fmt"
	"io"
	"io/ioutil"
)

func test(chunks [][]byte) {
	pr, pw := io.Pipe()
	go func() {
		for _, chunk := range chunks {
			pw.Write(chunk)
		}
		pw.Close()
	}()
	output, err := ioutil.ReadAll(base32.NewDecoder(base32.StdEncoding, pr))
	fmt.Printf("%+q -> %+q %v\n", chunks, output, err)
}

func main() {
	input := []byte("I4======N4======")
	// input := []byte("I5XQ===============")
	for i := 0; i < len(input)+1; i++ {
		test([][]byte{input[:i], input[i:]})
	}
}

You can also try this input to see inconsistent error offsets (goes from index 4, to 0, back to 4):

	input := []byte("I5XQ===============")

What did you expect to see?

The decoder's output should be consistent, regardless of the chunking of the underlying reader.

["" "I4======N4======"] -> "" illegal base32 data at input byte 2
["I" "4======N4======"] -> "" illegal base32 data at input byte 2
["I4" "======N4======"] -> "" illegal base32 data at input byte 2
["I4=" "=====N4======"] -> "" illegal base32 data at input byte 2
["I4==" "====N4======"] -> "" illegal base32 data at input byte 2
["I4===" "===N4======"] -> "" illegal base32 data at input byte 2
["I4====" "==N4======"] -> "" illegal base32 data at input byte 2
["I4=====" "=N4======"] -> "" illegal base32 data at input byte 2
["I4======" "N4======"] -> "" illegal base32 data at input byte 2
["I4======N" "4======"] -> "" illegal base32 data at input byte 2
["I4======N4" "======"] -> "" illegal base32 data at input byte 2
["I4======N4=" "====="] -> "" illegal base32 data at input byte 2
["I4======N4==" "===="] -> "" illegal base32 data at input byte 2
["I4======N4===" "==="] -> "" illegal base32 data at input byte 2
["I4======N4====" "=="] -> "" illegal base32 data at input byte 2
["I4======N4=====" "="] -> "" illegal base32 data at input byte 2
["I4======N4======" ""] -> "" illegal base32 data at input byte 2

What did you see instead?

["" "I4======N4======"] -> "" illegal base32 data at input byte 2
["I" "4======N4======"] -> "" illegal base32 data at input byte 2
["I4" "======N4======"] -> "" illegal base32 data at input byte 2
["I4=" "=====N4======"] -> "" illegal base32 data at input byte 2
["I4==" "====N4======"] -> "" illegal base32 data at input byte 2
["I4===" "===N4======"] -> "" illegal base32 data at input byte 2
["I4====" "==N4======"] -> "" illegal base32 data at input byte 2
["I4=====" "=N4======"] -> "" illegal base32 data at input byte 2
["I4======" "N4======"] -> "Go" <nil>
["I4======N" "4======"] -> "Go" <nil>
["I4======N4" "======"] -> "Go" <nil>
["I4======N4=" "====="] -> "Go" <nil>
["I4======N4==" "===="] -> "Go" <nil>
["I4======N4===" "==="] -> "Go" <nil>
["I4======N4====" "=="] -> "Go" <nil>
["I4======N4=====" "="] -> "Go" <nil>
["I4======N4======" ""] -> "" illegal base32 data at input byte 2
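For illustration only (this is not part of the original report or of the eventual fix), the expectation can be written as a check: every two-way split of an input should give the same output and error as the first split. The helper names `decodeSplit` and `consistentAcrossSplits` are hypothetical. Going by the output listed above, the malformed input from this report would be expected to fail the check on affected versions such as go1.14.2.

```go
package main

import (
	"bytes"
	"encoding/base32"
	"fmt"
	"io"
	"io/ioutil"
)

// result records what the stream decoder produced for one chunking.
type result struct {
	output []byte
	errMsg string // empty when the decoder returned a nil error
}

// decodeSplit (hypothetical helper) feeds input to a base32 stream decoder
// as two chunks, input[:i] and input[i:], as in the reproduction above.
func decodeSplit(input []byte, i int) result {
	pr, pw := io.Pipe()
	// Closing the read side unblocks the writer if the decoder stops early.
	defer pr.Close()
	go func() {
		pw.Write(input[:i])
		pw.Write(input[i:])
		pw.Close()
	}()
	out, err := ioutil.ReadAll(base32.NewDecoder(base32.StdEncoding, pr))
	r := result{output: out}
	if err != nil {
		r.errMsg = err.Error()
	}
	return r
}

// consistentAcrossSplits (hypothetical helper) reports whether every
// two-way split of input decodes to the same output and error message
// as the split at index 0.
func consistentAcrossSplits(input []byte) bool {
	want := decodeSplit(input, 0)
	for i := 1; i <= len(input); i++ {
		got := decodeSplit(input, i)
		if !bytes.Equal(got.output, want.output) || got.errMsg != want.errMsg {
			return false
		}
	}
	return true
}

func main() {
	// The first input is well-formed; the second is the input from this report.
	for _, s := range []string{"I5XQ====", "I4======N4======"} {
		fmt.Printf("%q consistent across splits: %v\n", s, consistentAcrossSplits([]byte(s)))
	}
}
```

The baseline is the split at index 0 rather than a one-shot DecodeString call, so the check only asserts that chunking does not matter; it says nothing about which result is the correct one.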

pelletier197 · 2020-07-29T18:10:36Z

I encountered the same issue with the base64 decoder. Sadly, my use case involves a gigantic base64 string split into multiple parts; I put the parts back together and decode them with the stream decoder (to limit memory use), but in some cases the decoded output is missing some bytes.

I had to fall back to loading the whole string into memory and decoding it in one shot, which consumes far more memory.
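As a rough sketch of a middle ground between those two approaches (an illustration, not something proposed in this thread), the parts can be decoded block by block with the non-streaming `Decode`, carrying over any bytes that do not yet form a full 4-character quantum. The function name `decodeParts` and the `emit` callback are hypothetical. The sketch assumes the parts contain no newlines or other whitespace and that padding appears only at the very end of the last part.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeParts (hypothetical helper) decodes base64 text that arrives in
// separate parts, handing each decoded block to emit as soon as possible.
// Assumptions: no whitespace inside the parts, and padding only at the
// very end of the final part.
func decodeParts(parts []string, emit func([]byte) error) error {
	var carry []byte // encoded bytes that do not yet fill a 4-character quantum
	for _, part := range parts {
		carry = append(carry, part...)
		aligned := len(carry) / 4 * 4 // longest prefix that is a multiple of 4
		if aligned == 0 {
			continue
		}
		decoded := make([]byte, base64.StdEncoding.DecodedLen(aligned))
		n, err := base64.StdEncoding.Decode(decoded, carry[:aligned])
		if err != nil {
			// Note: the offset in err is relative to this block, not the whole input.
			return err
		}
		if err := emit(decoded[:n]); err != nil {
			return err
		}
		// Keep only the unaligned tail for the next iteration.
		carry = append(carry[:0], carry[aligned:]...)
	}
	if len(carry) != 0 {
		return fmt.Errorf("%d trailing bytes do not form a complete base64 quantum", len(carry))
	}
	return nil
}

func main() {
	enc := base64.StdEncoding.EncodeToString([]byte("hello, gophers"))
	parts := []string{enc[:3], enc[3:10], enc[10:]}

	var out []byte
	err := decodeParts(parts, func(b []byte) error {
		out = append(out, b...)
		return nil
	})
	fmt.Printf("%q %v\n", out, err)
}
```

Memory use is bounded by the size of the largest part plus up to three carried bytes, rather than by the size of the whole encoded string. In real use, `emit` would write to a file or another sink instead of appending to a slice.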

gopherbot · 2022-05-03T07:09:37Z

Change https://go.dev/cl/403315 mentions this issue: encoding/base32: decoder output depends on chunking of underlying reader

ianlancetaylor added the help wanted and NeedsInvestigation labels Apr 25, 2020

ianlancetaylor added this to the Backlog milestone Apr 25, 2020

seankhliao mentioned this issue Jan 22, 2022

encoding/base32: Stream-based Decoder Might Have a Design Flaw #50754

Closed

teivah mentioned this issue Apr 30, 2022

encoding/base32: decoder output depends on chunking of underlying reader #52631

Closed

gopherbot closed this as completed in 8a5845e May 3, 2022

golang locked and limited conversation to collaborators May 3, 2023

gopherbot added the FrozenDueToAge label May 3, 2023
