Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/text/unicode: Panic on a particular string #20079

Closed
albertito opened this issue Apr 22, 2017 · 3 comments
Closed

x/text/unicode: Panic on a particular string #20079

albertito opened this issue Apr 22, 2017 · 3 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@albertito
Copy link
Contributor

When running go-fuzz against a personal package, I found a string that causes
precis / unicode to panic.

What version of Go are you using (go version)?

go version go1.8 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/redacted/go/workspace/"
GORACE=""
GOROOT="/home/redacted/go/go"
GOTOOLDIR="/home/redacted/go/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build130217842=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

What did you do?

Run go-fuzz against one of my packages, and found the following input that
causes x/text/unicode to panic.

Just in case, I updated the package, and is now at commit 9a820217f98f7c8a207ec1e45a874e1fe12c478 ("internal/catmsg: added varint implementation").

The test calls x/text/secure/precis (as that's how I found it), which then calls x/text/unicode, which panics.

I have not yet investigated further or reduced the input; I'll post updates if I do and get something useful.

Test program

(also at https://play.golang.org/p/3X0GsJQLb-):

package main

import (
    "fmt"

    "golang.org/x/text/secure/precis"
)

func test(s string) {
    fmt.Printf("%q\n", s)
    fmt.Printf("   == %+q\n", s)
    n, err := precis.UsernameCaseMapped.String(s)
    fmt.Printf("   -> %q\n", n)
    fmt.Printf("      %v\n", err)
    fmt.Println()
}

func main() {
    s := "\xd3D\xeb\xcd\x84\xdb\x98\xcc\xb4\xcd\x84\xdb" +
        "\x98\xcc\xb4\xcd\x84\xdb\x98\xcc\xb4\xcd\x84\xdb" +
        "\x98\xcc\xb4\xcd\x84\xdb\x98\xcc\xb4\xcd\x84\xdb" +
        "\x98\xcc\xb4\xcd\x84\xdb\x98\xcc\xb4\xcd\x84\xf4" +
        "\xd3N@\x00\x00t"
    test(s)
}

Output

"\xd3D\xeb\xf4\xd3N@\x00\x00t"
   == "\xd3D\xeb\u0344\u06d8\u0334\u0344\u06d8\u0334\u0344\u06d8\u0334\u0344\u06d8\u0334\u0344\u06d8\u0334\u0344\u06d8\u0334\u0344\u06d8\u0334\u0344\xf4\xd3N@\x00\x00t"
panic: runtime error: index out of range

goroutine 1 [running]:
golang.org/x/text/unicode/norm.(*reorderBuffer).insertOrdered(0xc42009e000, 0x990132320280)
        /home/redacted/go/workspace/src/golang.org/x/text/unicode/norm/composition.go:201 +0xa0
golang.org/x/text/unicode/norm.(*reorderBuffer).insertDecomposed(0xc42009e000, 0x57d852, 0x4, 0x2f, 0x4991c6)
        /home/redacted/go/workspace/src/golang.org/x/text/unicode/norm/composition.go:260 +0x1f1
golang.org/x/text/unicode/norm.(*reorderBuffer).insertFlush(0xc42009e000, 0x0, 0x0, 0xc42004e232, 0x34, 0x49, 0x2b, 0x4a71160232320200, 0x0)
        /home/redacted/go/workspace/src/golang.org/x/text/unicode/norm/composition.go:224 +0x9d
golang.org/x/text/unicode/norm.decomposeSegment(0xc42009e000, 0x2b, 0x1, 0x60)
        /home/redacted/go/workspace/src/golang.org/x/text/unicode/norm/normalize.go:542 +0x2a1
golang.org/x/text/unicode/norm.Form.transform(0x0, 0xc42004e282, 0x49, 0x49, 0xc42004e232, 0x34, 0x49, 0x1, 0x1, 0x0, ...)
        /home/redacted/go/workspace/src/golang.org/x/text/unicode/norm/transform.go:62 +0x25b
golang.org/x/text/unicode/norm.Form.Transform(0x0, 0xc42004e282, 0x49, 0x49, 0xc42004e232, 0x34, 0x49, 0x1, 0x4a71160232320201, 0x8f442c242003dae8, ...)
        /home/redacted/go/workspace/src/golang.org/x/text/unicode/norm/transform.go:33 +0x32c
golang.org/x/text/unicode/norm.(*Form).Transform(0xc42000e488, 0xc42004e282, 0x49, 0x49, 0xc42004e232, 0x34, 0x49, 0x4b6301, 0x4cb940, 0x4ae901, ...)
        <autogenerated>:41 +0xa8
golang.org/x/text/transform.doAppend(0x7f55017a40a0, 0xc42000e488, 0x2, 0xc42004e280, 0x4b, 0x4b, 0xc42004e232, 0x34, 0x49, 0x50, ...)
        /home/redacted/go/workspace/src/golang.org/x/text/transform/transform.go:692 +0x125
golang.org/x/text/transform.Append(0x7f55017a40a0, 0xc42000e488, 0xc42004e280, 0x2, 0x4b, 0xc42004e232, 0x34, 0x49, 0xc42004e230, 0x36, ...)
        /home/redacted/go/workspace/src/golang.org/x/text/transform/transform.go:685 +0xa9
golang.org/x/text/secure/precis.(*buffers).apply(0xc42003dde8, 0x557cc0, 0xc42000e488, 0x0, 0x0)
        /home/redacted/go/workspace/src/golang.org/x/text/secure/precis/profile.go:106 +0x25e
golang.org/x/text/secure/precis.(*buffers).enforce(0xc42003dde8, 0x586e20, 0xc420016640, 0x36, 0x40, 0x0, 0xc42003de28, 0x476035, 0xc4200940c0, 0xc42009c000, ...)
        /home/redacted/go/workspace/src/golang.org/x/text/secure/precis/profile.go:182 +0x588
golang.org/x/text/secure/precis.processString(0x586e20, 0x4ee5ae, 0x36, 0x0, 0x1, 0xa9, 0x0, 0x0)
        /home/redacted/go/workspace/src/golang.org/x/text/secure/precis/profile.go:250 +0xba
golang.org/x/text/secure/precis.(*Profile).String(0x586e20, 0x4ee5ae, 0x36, 0x1, 0x1, 0xa9, 0x0)
        /home/redacted/go/workspace/src/golang.org/x/text/secure/precis/profile.go:259 +0x44
main.test(0x4ee5ae, 0x36)
        /home/redacted/go/workspace/src/tmp/normalize.go:12 +0x1a7
main.main()
        /home/redacted/go/workspace/src/tmp/normalize.go:24 +0x36
exit status 2
@gopherbot gopherbot added this to the Unreleased milestone Apr 22, 2017
@SamWhited
Copy link
Member

SamWhited commented Apr 22, 2017

This is an issue with the combining step of normalization; here's a program derived from the above:

package main

import "golang.org/x/text/unicode/norm"

func main() {
	s := "\xeb\xcd\x84\xdb\x98\xcc\xb4\xcd\x84\xdb" +
		"\x98\xcc\xb4\xcd\x84\xdb\x98\xcc\xb4\xcd\x84\xdb" +
		"\x98\xcc\xb4\xcd\x84\xdb\x98\xcc\xb4\xcd\x84\xdb" +
		"\x98\xcc\xb4\xcd\x84\xdb\x98\xcc\xb4\xcd\x84"
	norm.NFC.String(s)
}

Bisecting between the current HEAD (golang/text@a9a8202) and golang/text@5ff64564 shows the following commit as the first one with the panic:

213f46580e8b85815e5c7e67eb2ceab674f8337b is the first bad commit
commit 213f46580e8b85815e5c7e67eb2ceab674f8337b
Author: Marcel van Lohuizen <mpvl@golang.org>
Date:   Fri Dec 20 16:37:05 2013 +0100

    go.text/unicode/norm: implemented CGJ insertion.
    This is the third and last CL in a series.

    Overview:
      All normalization methods now use the streamSafe type to keep track of
      the number of non-starter runes in a segment. The aim is that they
      all detect overflow at exactly the same point and that they do so in
      accordance with the Stream-Safe Text Processing definition in
      Unicode TR15.
      The normalization relies on streamSafe to count and break correctly
      and hence there is less checking for overflow in the code.

    Tables
      - Now hold nTrailingNonStarters for Hangul. This simplifies the code
        and avoids an expensive check for isHangul and isHangulWithoutJamoT
            at critical parts of the code. combinesForward is now correctly set
            for Hangul as well, as this comes at no additional cost.
            Total table size increase as a result of this addition is about 500b.
      - Added an extra decomposition category firstStarterWithNLead to handle
        getting the nLead values for FF9E|F for NFC. These are, in fact, not
            decompositions, but treating them as such allows us handle these
            runes in a piece of code that is rarely called in the first place
            and at almost no additional cost. compInfo detects the case and
            unmarks the Properties as a decomposition, but setting nLead
            appropriately.
      - We treat starters that combine backwards as non-starters. To this end,
        we set the nTrail to the number of runes of such runes (can be 2!).
            Note that for all runes that combine backwards nLead == nTrail.

    Testing
      As a result of this change, all normalization techniques handle overflow
      exactly the same. We enforce this by letting each method pass the tests
      of all other methods, for so far reasonable. Most tests have been moved
      to appendTests and runNormTests have been added.

    Naming and comments:
      - maxCombiningChars is now maxNonStarters for consistency.
      - Referring to non-starters as rune that do not start a new segment.
      - Renamed skipNonStarters to skipContinuationBytes to avoid confusion.

    Details:
      - Iteration over Hangul had to be changed to allow for non-Korean
        modifiers on Hangul to be put int he same segment. Not very useful,
            but it is more according the spec of Iter.

    R=r
    CC=golang-dev
    https://golang.org/cl/42350044

:040000 040000 1781d734dd7a426f9db3b6580688d92a9c3a213b bd7b015e98c7e9aaf4f9ceacc82bfaf223fe071a M     unicode
bisect run success

/cc @mpvl

@SamWhited SamWhited added the NeedsFix The path to resolution is known, but the work has not been done. label Apr 24, 2017
@gopherbot
Copy link

CL https://golang.org/cl/41680 mentions this issue.

@SamWhited SamWhited self-assigned this Apr 25, 2017
@gopherbot
Copy link

CL https://golang.org/cl/41697 mentions this issue.

@golang golang locked and limited conversation to collaborators Apr 27, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

3 participants