x/text/cases: Upper drops the 129th character #11460
Greek works fine. print(cases.Upper(language.Make("el")).String(text))
The problem starts in transform/transform.go (line 571):
nDst, nSrc, err = t.Transform(dst[pDst:], src[:n], pSrc+n == len(s))
At this point,
func (t *undUpperCaser) Transform(dst, src []byte, atEOF bool) (nDst, nSrc int, err error) {
	c := context{dst: dst, src: src, atEOF: atEOF}
	for c.next() {
		upper(&c)
	}
	// Standard upper case does not need any lookahead so we can safely not use
	// the checkpointing mechanism. pDst and pSrc will always point to the
	// furthest possible position.
	return c.pDst, c.pSrc, c.err
}
But unfortunately
func (c *context) next() bool {
	c.pSrc += c.sz
	if c.pSrc == len(c.src) || c.err != nil {
		c.info, c.sz = 0, 0
		return false
	}
	v, sz := trie.lookup(c.src[c.pSrc:])
	c.info, c.sz = info(v), sz
	if c.sz == 0 {
		if c.atEOF {
			// A zero size means we have an incomplete rune. If we are atEOF,
			// this means it is an illegal rune, which we will consume one
			// byte at a time.
			c.sz = 1
		} else {
			c.err = transform.ErrShortSrc
			return false
		}
	}
	return true
}
Back in transform/transform.go, this means
I see two solutions. First, if passing an empty
if pDst+nDst < initialBufSize {
If instead an empty
func (c *context) next() bool {
	if len(c.dst) == 0 {
		c.err = transform.ErrShortDst
		return false
	}
	c.pSrc += c.sz
	if c.pSrc == len(c.src) || c.err != nil {
		c.info, c.sz = 0, 0
		return false
	}
	v, sz := trie.lookup(c.src[c.pSrc:])
	c.info, c.sz = info(v), sz
	if c.sz == 0 {
		if c.atEOF {
			// A zero size means we have an incomplete rune. If we are atEOF,
			// this means it is an illegal rune, which we will consume one
			// byte at a time.
			c.sz = 1
		} else {
			c.err = transform.ErrShortSrc
			return false
		}
	}
	return true
}
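To make the failure mode easier to see in isolation, here is a minimal, ASCII-only sketch. It is not the x/text code: `toyContext`, `toyTransform`, `errShortDst` and the `guard` flag are hypothetical stand-ins, and the trie lookup is replaced by a fixed one-byte step. It only assumes what the snippets above show, namely that `next()` adds `c.sz` to `c.pSrc` before it looks at `c.err`, so a write that fails on an empty `dst` can still leave the source position one byte ahead.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errShortDst is a stand-in for transform.ErrShortDst.
var errShortDst = errors.New("short destination buffer")

// toyContext is a hypothetical, ASCII-only stand-in for the context type
// quoted above. Every "rune" is exactly one byte wide.
type toyContext struct {
	dst, src   []byte
	pDst, pSrc int
	sz         int
	err        error
	guard      bool // when true, next refuses to run with an empty dst
}

// next mirrors the shape of the quoted next(): it advances pSrc by the size
// of the previous rune before checking whether an error was recorded.
func (c *toyContext) next() bool {
	if c.guard && len(c.dst) == 0 {
		c.err = errShortDst
		return false
	}
	c.pSrc += c.sz
	if c.pSrc == len(c.src) || c.err != nil {
		c.sz = 0
		return false
	}
	c.sz = 1
	return true
}

// upper writes the upper-cased current byte, or records errShortDst when
// there is no room left in dst.
func (c *toyContext) upper() {
	if c.pDst == len(c.dst) {
		c.err = errShortDst
		return
	}
	b := c.src[c.pSrc]
	if 'a' <= b && b <= 'z' {
		b -= 'a' - 'A'
	}
	c.dst[c.pDst] = b
	c.pDst++
}

// toyTransform has the same loop shape as the quoted Transform method.
func toyTransform(dst, src []byte, guard bool) (nDst, nSrc int, err error) {
	c := toyContext{dst: dst, src: src, guard: guard}
	for c.next() {
		c.upper()
	}
	return c.pDst, c.pSrc, c.err
}

func main() {
	text := []byte(strings.Repeat("abcd", 32) + "d") // 129 bytes
	dst := make([]byte, 128)

	// The first call fills dst completely, so dst[nDst:] is empty afterwards.
	nDst, nSrc, _ := toyTransform(dst, text[:128], false)

	// The second call receives an empty dst and the 129th byte as src.
	for _, guard := range []bool{false, true} {
		d, s, err := toyTransform(dst[nDst:], text[nSrc:], guard)
		fmt.Printf("guard=%v: wrote %d, consumed %d, err=%v\n", guard, d, s, err)
	}
}
```

Without the guard, the second call reports one source byte consumed and nothing written, so a caller that trusts `nSrc` skips that byte. With the guard, the call reports zero bytes consumed, and the caller can grow `dst` and retry. The sketch only covers the empty-`dst` case discussed here.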
Resolved, see https://go-review.googlesource.com/#/c/13076/.
Here is a simple program that should uppercase some text:
I would expect to get:
but instead I get:
The last "D" is dropped, which is the 129th character.