internal/poll: CopyFileRange returns EPERM on CircleCI Docker Host running 4.10.0-40-generic #40893

Closed
thoeni opened this issue Aug 19, 2020 · 15 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@thoeni
Contributor

thoeni commented Aug 19, 2020

What version of Go are you using (go version)?

$ go version
go version go1.15 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

Docker Host running Ubuntu 18.04 on Linux Kernel 4.10.0-40-generic

This is the Go env of the container trying to build:

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/circleci/.cache/go-build"
GOENV="/home/circleci/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/circleci/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/circleci/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/circleci/project/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build612039276=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I tried to build a Go binary (with Go 1.15) within a Docker container (from gimg/go:1.15) running on a CircleCI remote Docker engine (17.11.0-ce).

What did you expect to see?

Docker image should have been built correctly, and my binary compiled successfully with Go 1.15.

What did you see instead?

An error like this for every binary in my project:

/usr/local/go/pkg/tool/linux_amd64/link: cannot write /tmp/go-link-518048351/000003.o: write /tmp/go-link-518048351/000003.o: copy_file_range: operation not permitted
/usr/local/go/pkg/tool/linux_amd64/link: cannot write /tmp/go-link-518048351/000035.o: write /tmp/go-link-518048351/000035.o: copy_file_range: operation not permitted
The command '/bin/bash -exo pipefail -c go install -mod=vendor -v ./...' returned a non-zero code: 2

Details from my investigation:

I build my binaries on CircleCI in a Docker container. CircleCI allows the user to pin a specific Docker engine, but they default to a version (17.11.0-ce) that runs on a Docker Host powered by Linux Kernel 4.10.0-40-generic.

Although this kernel does support the copy_file_range syscall, the Go linker gets this error, possibly because of restrictions set up by the vendor:

/usr/local/go/pkg/tool/linux_amd64/link: cannot write /tmp/go-link-518048351/000003.o: write /tmp/go-link-518048351/000003.o: copy_file_range: operation not permitted

This seems to correspond to the EPERM error.

I've filed a report with the vendor asking why this happens and I'm waiting for a reply. In the meantime: is this a scenario that we would like to take into account, falling back to the other approach here as well?
I assume EPERM can also be returned when the user does not have access to a file. So, of the two branches of this switch, I guess this scenario is closer to the second case, where we might want a one-off fallback.
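To make the question above concrete, here is a minimal sketch of a "try the fast path, fall back on certain errnos" copy. It is not the internal/poll code and not the fix that was eventually submitted. The names shouldFallBack, copyWithFallback, fastCopy, demo, errDenied and errDiskFull are hypothetical, and the list of errnos that trigger the fallback is an assumption made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
	"syscall"
)

// Hypothetical sample errors used by the demo below.
var (
	errDenied   error = syscall.EPERM  // the errno reported in this thread
	errDiskFull error = syscall.ENOSPC // a genuine failure that must not be masked
)

// shouldFallBack reports whether a failed fast-path copy should be retried
// with a plain read/write loop. The errno list is an assumption for this
// sketch, not a statement about what Go itself checks.
func shouldFallBack(err error) bool {
	var errno syscall.Errno
	if !errors.As(err, &errno) {
		return false
	}
	switch errno {
	case syscall.ENOSYS, syscall.EXDEV, syscall.EINVAL, syscall.EPERM:
		return true
	}
	return false
}

// fastCopy stands in for an accelerated copy such as one built on
// copy_file_range. It is a hypothetical type for this example.
type fastCopy func(dst io.Writer, src io.Reader) (int64, error)

// copyWithFallback tries fast first. It falls back to io.Copy only when the
// fast path wrote nothing and failed with an errno accepted by shouldFallBack.
func copyWithFallback(dst io.Writer, src io.Reader, fast fastCopy) (n int64, usedFallback bool, err error) {
	n, err = fast(dst, src)
	if err == nil || n > 0 || !shouldFallBack(err) {
		return n, false, err
	}
	n, err = io.Copy(dst, src)
	return n, true, err
}

// demo runs copyWithFallback with a fast path that always fails with fastErr
// and returns what ended up in the destination.
func demo(fastErr error) (out string, usedFallback bool, err error) {
	var dst strings.Builder
	failing := func(io.Writer, io.Reader) (int64, error) { return 0, fastErr }
	_, usedFallback, err = copyWithFallback(&dst, strings.NewReader("object file bytes"), failing)
	return dst.String(), usedFallback, err
}

func main() {
	fmt.Println(demo(errDenied))   // fast path denied, generic copy takes over
	fmt.Println(demo(errDiskFull)) // real error is returned to the caller
}
```

The trade-off is the one raised above: treating EPERM as "not handled" also hides a real permission problem on the fast path, but the generic copy should then report that same problem through an ordinary read or write error.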

I spun up a VirtualBox environment and built the same code that breaks on CircleCI on three different kernels. All the builds succeeded, which suggests that the issue I've encountered is related to some limitation or sandboxing imposed by the vendor:

  • 4.15.0-45-generic <- the kernel officially supported by Ubuntu 16.04 LTS
  • 4.10.0-40-generic <- the one used by CircleCI
  • 4.4.232-0404232-generic <- the latest prior to the introduction of copy_file_range

Unfortunately I can't reproduce the exact setup, as CircleCI uses its own internal AMIs/images that are not accessible to me. I asked for the kernel details, and they told me the Docker Host that breaks my linker/builds runs on 4.10.0-40-generic.

The above was originally posted as a comment on #40731; I've opened a dedicated issue for easier traceability.

@davecheney
Contributor

/cc @ianlancetaylor

@thoeni thanks for raising a new issue. Would you be able to run some experiments to try to isolate the problem? My main thought is that operation not permitted means what it says: the user running that build isn't able to write to /tmp. Would you be able to rejig your image build process to work under something from https://hub.docker.com/_/golang? It looks like those images are built against a deb-based system, so there shouldn't be many surprises. I'd be interested in knowing if a different base image produces the same permission errors.

@davecheney
Contributor

Also paging @tianon who knows more about this stuff than the rest of us put together.

@thoeni
Contributor Author

thoeni commented Aug 19, 2020

/cc @ianlancetaylor

Would you be able to rejig your image build process to work under something from https://hub.docker.com/_/golang.

I originally started from golang:1.15.0-alpine3.12, which is where I first encountered the error, and raised the issue with CircleCI. They then asked me to try their own container to make sure the problem was not related to the image, and the error seems to happen regardless.

I will try to create a public project with a simple app and the .circleci configuration that triggers the failure, so people can reproduce it on CircleCI. My build breaks every time regardless of the Docker image used for the build. I resolved it by migrating to a different CircleCI remote Docker engine (18.06.0-ce), which runs on kernel 4.15.0-1014-gcp; with it the build works fine on every container I tried.

Also, the Docker Host that currently triggers the failure worked fine with Go containers up to 1.14.7.

Hope this helps, but I'll try to create a test project to allow others to reproduce it.

@tianon
Contributor

tianon commented Aug 19, 2020

Given the error (and what's already been ruled out 💪), my initial guess would be seccomp or apparmor related -- does CircleCI use a profile for either of those that's more restrictive than Docker's defaults? Maybe some extra capabilities they drop?

@tklauser
Member

tklauser commented Aug 19, 2020

It looks like this is an issue similar to #40731. AFAIK, processes running inside Docker may get EPERM for unimplemented or unsupported syscalls due to its default seccomp policy (we had this case a few times in x/sys/unix already). I've sent https://golang.org/cl/249257 with a possible fix.

/cc @ianlancetaylor

@tianon
Contributor

tianon commented Aug 19, 2020

Also, 17.11.0-ce (and Linux 4.10) are both pretty old in terms of containers. 😅

@gopherbot

Change https://golang.org/cl/249257 mentions this issue: internal/poll: treat copy_file_range EPERM as not-handled

@tklauser
Member

@gopherbot please consider backporting this to 1.15.

The use of copy_file_range is new in 1.15 and this causes failures doing file system operations in containers running on older kernels which do not provide the copy_file_range syscall.
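For readers who want to exercise this code path outside the linker, a small file-to-file copy is enough. On Linux with Go 1.15 or later, io.Copy between two *os.File values goes through (*os.File).ReadFrom, which attempts copy_file_range. The sketch below is only a reproduction aid under that assumption. The helper names copyFile and roundTrip and the temporary file names are hypothetical, and io/ioutil is used so the program still compiles with Go 1.15.

```go
package main

import (
	"fmt"
	"io"
	"io/ioutil"
	"os"
	"path/filepath"
)

// copyFile copies src to dst with io.Copy. Both ends are *os.File, which is
// the case where Go 1.15+ on Linux may try copy_file_range.
func copyFile(dst, src string) (int64, error) {
	in, err := os.Open(src)
	if err != nil {
		return 0, err
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return 0, err
	}
	n, err := io.Copy(out, in)
	// Report a failed Close only if the copy itself succeeded.
	if cerr := out.Close(); err == nil {
		err = cerr
	}
	return n, err
}

// roundTrip writes payload to a temporary file, copies it with copyFile and
// returns the contents of the copy.
func roundTrip(payload string) (string, error) {
	dir, err := ioutil.TempDir("", "cfr-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "src.o")
	dst := filepath.Join(dir, "dst.o")
	if err := ioutil.WriteFile(src, []byte(payload), 0600); err != nil {
		return "", err
	}
	if _, err := copyFile(dst, src); err != nil {
		return "", err
	}
	got, err := ioutil.ReadFile(dst)
	return string(got), err
}

func main() {
	got, err := roundTrip("object file bytes")
	if err != nil {
		// On an affected setup the error text would be expected to mention
		// copy_file_range, as in the linker output quoted in this issue.
		fmt.Println("copy failed:", err)
		os.Exit(1)
	}
	fmt.Printf("copied %d bytes\n", len(got))
}
```

Running this inside the failing container with an unpatched Go 1.15 toolchain would be expected to surface the same "operation not permitted" error. With a toolchain that includes the fix it should succeed via the generic copy.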

@gopherbot

Backport issue(s) opened: #40900 (for 1.15).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@tianon
Contributor

tianon commented Aug 19, 2020

Ahh, so this could also be related to the version of libseccomp that their build of 17.11.0-ce is compiled against. 😅

(See also moby/moby#40734 for a recent example of newer kernel APIs being blocked by default thanks to that.)

If you can run with the equivalent of --security-opt seccomp:unconfined, that would be a good test to confirm.
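A complementary check, sketched here under the assumption that /proc is mounted in the container, is to read the Seccomp field from /proc/self/status. The kernel reports 0 for no seccomp, 1 for strict mode and 2 for filter mode. The helper name parseSeccompMode is hypothetical. Under an unconfined run one would expect mode 0, and under a seccomp profile mode 2.

```go
package main

import (
	"fmt"
	"io/ioutil"
	"strconv"
	"strings"
)

// parseSeccompMode extracts the value of the "Seccomp:" line from the text of
// /proc/<pid>/status. The second result is false when the field is missing or
// not a number.
func parseSeccompMode(status string) (int, bool) {
	for _, line := range strings.Split(status, "\n") {
		// "Seccomp_filters:" on newer kernels does not match this prefix.
		if !strings.HasPrefix(line, "Seccomp:") {
			continue
		}
		value := strings.TrimSpace(strings.TrimPrefix(line, "Seccomp:"))
		mode, err := strconv.Atoi(value)
		if err != nil {
			return 0, false
		}
		return mode, true
	}
	return 0, false
}

func main() {
	data, err := ioutil.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read /proc/self/status:", err)
		return
	}
	mode, ok := parseSeccompMode(string(data))
	if !ok {
		fmt.Println("no Seccomp field found")
		return
	}
	fmt.Println("seccomp mode:", mode)
}
```

This only shows whether a filter is active, not which syscalls it blocks, so it narrows the diagnosis rather than confirming it.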

@thoeni
Contributor Author

thoeni commented Aug 19, 2020

If you can run with the equivalent of --security-opt seccomp:unconfined, that would be a good test to confirm.

@tianon I can definitely try. Just to clarify, since my build is failing on docker build, can --security-opt seccomp:unconfined be used on build or only on docker run?

@thoeni
Contributor Author

thoeni commented Aug 19, 2020

Sorry, got the answer already:

#!/bin/bash -eo pipefail
docker build --security-opt seccomp:unconfined -f Dockerfile -t "${CIRCLE_PROJECT_REPONAME}:${CIRCLE_SHA1}" .

Error response from daemon: The daemon on this platform does not support setting security options on build
ERRO[0000] Can't add file /root/project/go.sum to tar: io: read/write on closed pipe 
ERRO[0000] Can't close tar writer: io: read/write on closed pipe 

Exited with code exit status 1

@tianon
Contributor

tianon commented Aug 19, 2020

It should work on build (definitely does in newer versions) but I'm not 100% certain how far back it goes.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 19, 2020
@dmitshur dmitshur added this to the Go1.16 milestone Aug 19, 2020
@moul

moul commented Aug 19, 2020

fyi, I followed this advice from https://discuss.circleci.com/t/cgo-docker-build-failing-at-link-stage-operation-not-permitted/37100 and it fixed my similar issue:

      - setup_remote_docker: 
            version: 18.09.3

janisz added a commit to janisz/dcos-cli that referenced this issue Aug 20, 2020
Use newer base kernel as recommended in
golang/go#40893
janisz added a commit to dcos/dcos-cli that referenced this issue Aug 21, 2020
* Set minimum TLS version to 1.2

It's now the industry standard to deprecate TLS 1.0 and 1.1.

See: https://tools.ietf.org/html/draft-ietf-tls-oldversions-deprecate-00

* Bump Go 1.15

Currently DC/OS CLI keeps backward compatibility for
macOS 10.10 Yosemite (2014) after updating Go version
we will drop it and support only macOS 10.12 Sierra or later.

* Use mesos image to build binaries

Use newer base kernel as recommended in
golang/go#40893

* Update CHANGELOG.md
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 21, 2020
@gopherbot

Change https://golang.org/cl/249897 mentions this issue: [release-branch.go1.15] internal/poll: treat copy_file_range EPERM as not-handled

gopherbot pushed a commit that referenced this issue Aug 22, 2020
… not-handled

Updates #40893.
Fixes #40900.

Change-Id: I938ea4796c1e1d1e136117fe78b06ad6da8e40de
Reviewed-on: https://go-review.googlesource.com/c/go/+/249257
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Antonio Troina <thoeni@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
(cherry picked from commit b0cc02e)
Reviewed-on: https://go-review.googlesource.com/c/go/+/249897
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
anonymouse64 added a commit to anonymouse64/snapd that referenced this issue Aug 27, 2020
This was recently introduced as an optimization to Go 1.15, and so apps that
start compiling may start to try and use it.

Note that as of this commit, Go 1.15 does not fall back, and so apps that use
this will fail outright, but there is work upstream in Go to fix this so that
apps that get denied usage of copy_file_range with an EPERM will fallback to
potentially slower implementations.

See golang/go#40893 and
https://go-review.googlesource.com/c/go/+/249257/ for more details on the Go
issue and the fallback implementation.

Signed-off-by: Ian Johnson <ian.johnson@canonical.com>
anonymouse64 added a commit to anonymouse64/snapd that referenced this issue Nov 25, 2020
This was recently introduced as an optimization to Go 1.15, and so apps that
start compiling may start to try and use it.

Note that Go 1.15 does now currently fall back to using other methods if copy_file_range
returns EPERM so that apps that get denied usage of copy_file_range with an EPERM 
will fallback to potentially slower implementations.

See golang/go#40893 and
https://go-review.googlesource.com/c/go/+/249257/ for more details on the Go
issue and the fallback implementation.

Signed-off-by: Ian Johnson <ian.johnson@canonical.com>
anonymouse64 added a commit to snapcore/snapd that referenced this issue Feb 13, 2021
…ccomp-default

interfaces/seccomp/template.go: allow copy_file_range

This was recently introduced as an optimization to Go 1.15, and so apps that
start compiling may start to try and use it.

Note that Go 1.15 does currently fall back to using other methods if copy_file_range
returns EPERM so that apps that get denied usage of copy_file_range will fallback
to potentially slower implementations. (originally upon Go 1.15 release there
was not a fallback implementation and the app would just crash returning a non-nil
error up the stack).

See golang/go#40893 and
https://go-review.googlesource.com/c/go/+/249257/ for more details on the Go
issue and the fallback implementation.

There are also some instances of Node.JS using this too with the libuv library, see 
fs.copyfile() and a corresponding forum topic for more details:
https://forum.snapcraft.io/t/snap-no-longer-has-write-permission/22686
mvo5 pushed a commit to snapcore/snapd that referenced this issue Feb 26, 2021
This was recently introduced as an optimization to Go 1.15, and so apps that
start compiling may start to try and use it.

Note that Go 1.15 does now currently fall back to using other methods if copy_file_range
returns EPERM so that apps that get denied usage of copy_file_range with an EPERM 
will fallback to potentially slower implementations.

See golang/go#40893 and
https://go-review.googlesource.com/c/go/+/249257/ for more details on the Go
issue and the fallback implementation.

Signed-off-by: Ian Johnson <ian.johnson@canonical.com>
brmzkw added a commit to openmaraude/console that referenced this issue Mar 28, 2021
Related to golang/go#40893

The command "npm run export" generates an error, because it uses the
unimplemented syscall "copyfile". This commit attempts to pin Docker
version.
@golang golang locked and limited conversation to collaborators Aug 21, 2021