Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/user: non-cgo implementation of LookupGroup and LookupGroupId can fail when large groups are present in /etc/group #43636

Closed
rturner3 opened this issue Jan 12, 2021 · 18 comments
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@rturner3
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.15.6 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build513596641=/tmp/go-build -gno-record-gcc-switches"

What did you do?

  1. Create a large Linux group in /etc/group.
  2. From a Go program built with CGO_ENABLED=0, call LookupGroup() or LookupGroupId() from the os/user package.

The provided example is run in a Docker container in order to not pollute the local host's users and groups. Note that the script executed to create a large group in /etc/group may take a few minutes to execute.

$ docker run -it golang:1.15.6-buster
# cat > setup_large_group.sh << EOF
#!/bin/bash

groupadd largegroup
for i in \$(seq 1 7500); do
    user="user\${i}"
    useradd  "\${user}"
    usermod -a -G largegroup  "\${user}"
done
EOF

# chmod +x setup_large_group.sh
# ./setup_large_group.sh
# wc -L /etc/group
66410 /etc/group
# cat > lookup_group.go << EOF
package main

import (
    "fmt"
    "os"
    "os/user"
)

func main() {
    gid := os.Args[1]
    group, err := user.LookupGroupId(gid)
    if err != nil {
        fmt.Fprintf(os.Stderr, "user.LookupGroupId: %v\n", err)
        os.Exit(1)
    }

    fmt.Println(group)
}
EOF

# CGO_ENABLED=0 go build lookup_group.go
# export LARGEGROUPID=$(cut -d: -f3 < <(getent group largegroup))
# ./lookup_group "${LARGEGROUPID}"

What did you expect to see?

&{1000 largegroup}

What did you see instead?

user.LookupGroupId: bufio.Scanner: token too long

This appears to happen when the length of a line in /etc/group exceeds bufio.MaxScanTokenSize bytes (currently 65536 bytes) in length. The implementation in lookup_unix.go does not gracefully handle lines in /etc/group exceeding 65536 bytes in length in its usage of bufio.Scanner.

As far as I know, there are no documented limits to the size of a group in Linux, so ideally the Go implementations of LookupGroup() and LookupGroupId() should work regardless of whether the binary is compiled with cgo or not.

One possible solution could be to read in the file using a bufio.Reader by calling r.ReadLine().

@seankhliao
Copy link
Member

for anyone trying at home: the setup can be just (so it doesn't take forever):

echo -n 'largegroup:x:1000:' >> /etc/group
for i in $(seq 1 7500); do echo -n user$i, >> /etc/group ; done

@seankhliao seankhliao changed the title os/user non-cgo implementation of LookupGroup and LookupGroupId can fail when large groups are present in /etc/group os/user: non-cgo implementation of LookupGroup and LookupGroupId can fail when large groups are present in /etc/group Jan 12, 2021
@seankhliao seankhliao added the NeedsFix The path to resolution is known, but the work has not been done. label Jan 12, 2021
@ianlancetaylor ianlancetaylor added this to the Backlog milestone Jan 12, 2021
@tpaschalis
Copy link
Contributor

I'll take a stab and see what can be done!

@andreybokhanko
Copy link
Contributor

I idependently starteded to work on this yesterday and missed @tpaschalis comment (my bad for not letting anyone else know!)

Either way, I have a patch (applicable against today's master) that fixes the bug and passes all the tests.

Here it goes:

diff --git a/src/os/user/lookup_unix.go b/src/os/user/lookup_unix.go
index 0890cd8f2b..35aca601d0 100644
--- a/src/os/user/lookup_unix.go
+++ b/src/os/user/lookup_unix.go
@@ -9,7 +9,6 @@ package user

import (
        "bufio"
-       "bytes"
        "errors"
        "io"
        "os"
@@ -20,26 +19,26 @@ import (
const groupFile = "/etc/group"
const userFile = "/etc/passwd"

-var colon = []byte{':'}
+var colon = string(':')

func init() {
        groupImplemented = false
}

// lineFunc returns a value, an error, or (nil, nil) to skip the row.
-type lineFunc func(line []byte) (v interface{}, err error)
+type lineFunc func(line string) (v interface{}, err error)

// readColonFile parses r as an /etc/group or /etc/passwd style file, running
// fn for each row. readColonFile returns a value, an error, or (nil, nil) if
// the end of the file is reached without a match.
func readColonFile(r io.Reader, fn lineFunc) (v interface{}, err error) {
-       bs := bufio.NewScanner(r)
-       for bs.Scan() {
-               line := bs.Bytes()
+       rd := bufio.NewReader(r)
+       var line string
+       for line, err = rd.ReadString('\n'); err == nil; line, err = rd.ReadString('\n') {
                // There's no spec for /etc/passwd or /etc/group, but we try to follow
                // the same rules as the glibc parser, which allows comments and blank
                // space at the beginning of a line.
-               line = bytes.TrimSpace(line)
+               line = strings.TrimSpace(line)
                if len(line) == 0 || line[0] == '#' {
                        continue
                }
@@ -48,7 +47,7 @@ func readColonFile(r io.Reader, fn lineFunc) (v interface{}, err error) {
                        return
                }
        }
-       return nil, bs.Err()
+       return nil, err
}

func matchGroupIndexValue(value string, idx int) lineFunc {
@@ -56,9 +55,9 @@ func matchGroupIndexValue(value string, idx int) lineFunc {
        if idx > 0 {
                leadColon = ":"
        }
-       substr := []byte(leadColon + value + ":")
-       return func(line []byte) (v interface{}, err error) {
-               if !bytes.Contains(line, substr) || bytes.Count(line, colon) < 3 {
+       substr := leadColon + value + ":"
+       return func(line string) (v interface{}, err error) {
+               if !strings.Contains(line, substr) || strings.Count(line, colon) < 3 {
                        return
                }
                // wheel:*:0:root
@@ -103,13 +102,13 @@ func matchUserIndexValue(value string, idx int) lineFunc {
        if idx > 0 {
                leadColon = ":"
        }
-       substr := []byte(leadColon + value + ":")
-       return func(line []byte) (v interface{}, err error) {
-               if !bytes.Contains(line, substr) || bytes.Count(line, colon) < 6 {
+       substr := leadColon + value + ":"
+       return func(line string) (v interface{}, err error) {
+               if !strings.Contains(line, substr) || strings.Count(line, colon) < 6 {
                        return
                }
                // kevin:x:1005:1006::/home/kevin:/usr/bin/zsh
-               parts := strings.SplitN(string(line), ":", 7)
+               parts := strings.SplitN(line, ":", 7)
                if len(parts) < 6 || parts[idx] != value || parts[0] == "" ||
                        parts[0][0] == '+' || parts[0][0] == '-' {
                        return

@tpaschalis , if you already knee-deep in cooking up a patch for this, by all means please continue and submit your patch (would be happy to be one of reviewers). If you haven't started yet, perhaps I can submit this patch and save you some time.

Yours,
Andrey

Director
Advanced Software Technology Lab
Huawei

@tpaschalis
Copy link
Contributor

tpaschalis commented Jan 14, 2021

Hello @andreybokhanko! I hoped I'd find some time to do it yesterday, but you beat me to it.

Feel free to submit your solution, and CC me as well if you'd like!

@gopherbot
Copy link

Change https://golang.org/cl/283601 mentions this issue: os/user: make user.LookupGroupId function work correctly with large

@andreybokhanko
Copy link
Contributor

andreybokhanko commented Jan 14, 2021

@tpaschalis, 👍

Sorry for causing confusion... it's funny that out of 771 currently open tickets with "help wanted" badge, we both chose the same one. But sometimes this happens. 😀

I've just submitted a pull request: https://go-review.googlesource.com/c/go/+/283601

@andreybokhanko
Copy link
Contributor

@ianlancetaylor Could you, please, assign this issue on me? It seems that I can't do so myself.

@nightlyone
Copy link
Contributor

@andreybokhanko what do you think about calling the method Buffer to configure a maximum token size of 4 << 20 (4MB) or bigger instead? That seems like a simpler fix to me.

One actually wants some limit here, but 64k might be too small these days.

What do glibc and musl libc use here? Do they have the same problems?

@rturner3
Copy link
Author

Thanks @andreybokhanko for jumping on this so quickly! Just wanted to say that it would also be great if as part of this issue resolution some unit tests were added that cover these kinds of cases with large lines in /etc/passwd and /etc/group.

Appreciate the timely response from the community on this one!

@andreybokhanko
Copy link
Contributor

andreybokhanko commented Jan 15, 2021

@andreybokhanko what do you think about calling the method Buffer to configure a maximum token size of 4 << 20 (4MB) or bigger instead? That seems like a simpler fix to me.

One actually wants some limit here, but 64k might be too small these days.

Hi @nightlyone , that's a viable design alternative, indeed.

What I don't like with this approach, though, is that we are essentially replacing one arbitrary limitation with another. Yep, 4MB might look more than enough today, but so was 64K some time ago... :)

What do glibc and musl libc use here? Do they have the same problems?

They have _SC_GETGR_R_SIZE_MAX constant for maximum size, that they get from OS (https://www.gnu.org/software/libc/manual/html_node/Constants-for-Sysconf.html). This is actually what CGO implementation of lookup functions uses as well.

As I understand, we can't use this C library constant in non-CGO implementation.

@andreybokhanko
Copy link
Contributor

Thanks @andreybokhanko for jumping on this so quickly! Just wanted to say that it would also be great if as part of this issue resolution some unit tests were added that cover these kinds of cases with large lines in /etc/passwd and /etc/group.

Fully agree. I'm new in Go community, so don't really know unit testing requirements here; but an extra unit test never hurts (well, almost never :))

I've just amended my patch with a unit test.

@nightlyone
Copy link
Contributor

@andreybokhanko _SC_GETGR_R_SIZE_MAX seems to be only some initial buffer size and not the actual limit as explained in https://tomlee.co/2012/10/problems-with-large-linux-unix-groups-and-getgrgid_r-getgrnam_r/

So looks like there is no actual practical limit and your approach of using an unbounded line buffer is the correct implementation.

@ianlancetaylor any opinions from your side?

@ianlancetaylor
Copy link
Contributor

I don't see any reason to use an unbounded buffer or to read the whole line into memory. Unless I'm missing something we never care about the members of the group. We can just read the first three fields and then skip to the end of the line without saving the data anywhere. This is admittedly a bigger reworking of this code.

@andreybokhanko
Copy link
Contributor

Hi @ianlancetaylor

I don't see any reason to use an unbounded buffer or to read the whole line into memory.

As @nightlyone confirmed, there is no actual practical limit to the group name -- that's why we have to use an unbounded buffer.

As for reading the rest of the line (with members of the group) into memory -- sure, we can skip this. And yes, this would require a much hairier and uglier implementation -- including re-implementing matchGroupIndexValue function from scratch, instead of calling readColonFile with a handler passed as an argument. readColonFile makes no assumptions on what kind of handler is passed -- whatever it needs group name, user list, or anything else -- so it requires full line to be present (see this part of my review for details). Also, giving that we already have to use an unbounded buffer (to keep and compare group name), this won't bring us any tangible savings in speed or memory, unless I'm missing something?

Andrey

@ianlancetaylor
Copy link
Contributor

While in principle the group name has unbounded length, in practice the list of group members is where lines get long. It's worth not bothering to save those group members in memory when we don't have to.

Yes, it's a bigger change to the current code, but I think it's the right place to end up. readColonFile is a special purpose function, it's OK to change the way that it works.

@andreybokhanko
Copy link
Contributor

While in principle the group name has unbounded length, in practice the list of group members is where lines get long. It's worth not bothering to save those group members in memory when we don't have to.

Yes, it's a bigger change to the current code, but I think it's the right place to end up. readColonFile is a special purpose function, it's OK to change the way that it works.

OK, got it. Will re-implement as you advised. Thanks!

Andrey

@andreybokhanko
Copy link
Contributor

Hi @ianlancetaylor , I re-implemented the fix as you advised (well, if I understood you well). Now we read rows in 4096 byte chunks.

Please kindly take a look: https://go-review.googlesource.com/c/go/+/283601

Andrey

@andreybokhanko
Copy link
Contributor

This is fixed by I453321f1ab15fd4d0002f97fcec7d0789e1e0da5; should be available in the next Go release.

@rturner3 thanks for reporting! @ianlancetaylor thanks for the thorough reivew!

Andrey

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

7 participants