x/build: scaleway builders write_snapshot_tar consistently taking > 1h #21839

Closed
s-mang opened this issue Sep 11, 2017 · 6 comments
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge

@s-mang s-mang self-assigned this Sep 11, 2017
@s-mang s-mang changed the title from "x/build: scaleway builders: write_snapshot_tar consistently taking > 1h" to "x/build: scaleway builders write_snapshot_tar consistently taking > 1h" Sep 11, 2017
@gopherbot gopherbot added this to the Unreleased milestone Sep 11, 2017
@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Sep 11, 2017
@s-mang
Contributor Author

s-mang commented Sep 11, 2017

gopherbot does not seem to like a second colon in the title.

@dsnet
Member

dsnet commented Sep 11, 2017

Unfortunately all of the links you posted are now stale and returning 404s :(

@s-mang
Contributor Author

s-mang commented Sep 11, 2017

Argh, shoot.

I still had the windows open, so I'm uploading the logs as text files.
Some example culprit lines:

4:
2017-09-11T18:23:16Z write_snapshot_tar
2017-09-11T19:44:11Z finish_write_snapshot_tar after 1h20m54.5s

5:
2017-09-11T18:49:54Z write_snapshot_tar
2017-09-11T20:26:43Z finish_write_snapshot_tar after 1h36m48.4s

6:
2017-09-11T18:49:52Z write_snapshot_tar
2017-09-11T20:09:12Z finish_write_snapshot_tar after 1h19m20s

7:
2017-09-11T19:04:35Z write_snapshot_tar
2017-09-11T20:38:02Z finish_write_snapshot_tar after 1h33m26.9s

temporarylogs.txt
temporarylogs2.txt
temporarylogs3.txt
temporarylogs4.txt
temporarylogs5.txt
temporarylogs6.txt
temporarylogs7.txt

@bradfitz
Contributor

Looks like a configuration problem: the --workdir being passed to the buildlet isn't the tmpfs mount, so it's doing lots of slow I/O over the network (Scaleway's disk is an NBD device, IIRC).

I saw during a build:

:: Running /tmp/workdir/go/src/make.bash with args ["/tmp/workdir/go/src/make.bash"] and env ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" "HOSTNAME=scaleway-prod-07" "ARCH=armv7l" "UBUNTU_SUITE=xenial" "DOCKER_REPO=multiarch/ubuntu-debootstrap" "DEBIAN_FRONTEND=noninteractive" "SCW_BASE_IMAGE=scaleway/ubuntu:xenial" "GO_BOOTSTRAP=/usr/local/go" "GO_BUILD_KEY_PATH=/buildkey/gobuildkey" "GO_BUILD_KEY_DELETE_AFTER_READ=true" "IN_KUBERNETES=1" "GO_BUILDER_ENV=host-linux-arm-scaleway" "META_BUILDLET_BINARY_URL=https://storage.googleapis.com/go-builder-data/buildlet.linux-arm" "HOME=/root" "USER=root" "WORKDIR=/tmp/workdir" "GOROOT_BOOTSTRAP=/usr/local/go" "GO_BUILDER_NAME=linux-arm" "GO_BUILDER_FLAKY_NET=1" "GOBIN="] in dir /tmp/workdir/go/src

(Note the workdir there: /tmp/workdir.)

But in rundockerbuildlet I see:

                out, err := exec.Command("docker", "run",
                        "-d",
                        "--memory="+*memory,
                        "--name="+name,
                        "-v", filepath.Dir(keyFile)+":/buildkey/",
                        "-e", "HOSTNAME="+name,
                        "--tmpfs=/workdir:rw,exec",
                        *image).CombinedOutput()

... which puts the tmpfs inside the container at /workdir, not at the /tmp/workdir the build is actually using.

@bradfitz
Contributor

Yup, confirmed with a gomote ssh session to a linux-arm builder.

$ gomote ssh user-bradfitz-linux-arm-0
$ ssh -p 2222 user-bradfitz-linux-arm-0@farmer.golang.org # auth using https://github.com/bradfitz.keys
# Welcome to the gomote ssh proxy, bradfitz.
# Connecting to/starting remote ssh...
#
# `gomote push` and the builders use:
# - workdir: /tmp/workdir
# - GOROOT: /tmp/workdir/go
# - GOPATH: /tmp/workdir/gopath
# - env: GO_BUILDER_NAME=linux-arm GO_BUILDER_FLAKY_NET=1 GOROOT_BOOTSTRAP=/usr/local/go
# Happy debugging.
....
root@d41b38a67ad1:~# df
Filesystem     1K-blocks    Used Available Use% Mounted on
none            47929956 6711792  38760376  15% /
tmpfs            1033900       0   1033900   0% /dev
tmpfs            1033900       0   1033900   0% /sys/fs/cgroup
/dev/nbd0       47929956 6711792  38760376  15% /buildkey
tmpfs            1033900       0   1033900   0% /workdir
shm                65536       0     65536   0% /dev/shm
root@d41b38a67ad1:~# df /tmp/workdir
Filesystem     1K-blocks    Used Available Use% Mounted on
none            47929956 6711792  38760376  15% /
root@d41b38a67ad1:~# cat /proc/mounts 
none / aufs rw,relatime,si=79190e56,dio,dirperm1 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev tmpfs rw,nosuid,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666 0 0
sysfs /sys sysfs ro,nosuid,nodev,noexec,relatime 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,relatime,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup ro,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup ro,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup ro,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/freezer cgroup ro,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/cpuset cgroup ro,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/devices cgroup ro,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/perf_event cgroup ro,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/memory cgroup ro,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/blkio cgroup ro,nosuid,nodev,noexec,relatime,blkio 0 0
mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
/dev/nbd0 /buildkey ext4 rw,relatime,data=ordered 0 0
tmpfs /workdir tmpfs rw,nosuid,nodev,relatime 0 0
/dev/nbd0 /etc/resolv.conf ext4 rw,relatime,data=ordered 0 0
/dev/nbd0 /etc/hostname ext4 rw,relatime,data=ordered 0 0
/dev/nbd0 /etc/hosts ext4 rw,relatime,data=ordered 0 0
shm /dev/shm tmpfs rw,nosuid,nodev,noexec,relatime,size=65536k 0 0
proc /proc/bus proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/fs proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/irq proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/sys proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/sysrq-trigger proc ro,nosuid,nodev,noexec,relatime 0 0
tmpfs /proc/timer_list tmpfs rw,nosuid,mode=755 0 0
tmpfs /proc/timer_stats tmpfs rw,nosuid,mode=755 0 0
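
The same check that df does above can be sketched in Go: find the longest mount point in /proc/mounts that contains a directory and report its filesystem type. This is only an illustrative sketch, not code from x/build; `findMount`, `underMount`, and `sampleMounts` are hypothetical names, and the sample is a four-line excerpt of the output above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// mount is one parsed line of /proc/mounts (first three fields only).
type mount struct {
	device, point, fsType string
}

// findMount returns the mount whose mount point is the longest
// path prefix of dir, i.e. the filesystem that dir lives on.
// Hypothetical helper, not part of x/build.
func findMount(procMounts, dir string) (mount, bool) {
	var best mount
	found := false
	sc := bufio.NewScanner(strings.NewReader(procMounts))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) < 3 {
			continue // skip blank or malformed lines
		}
		m := mount{device: f[0], point: f[1], fsType: f[2]}
		if !underMount(dir, m.point) {
			continue
		}
		// ">=" lets a later mount on the same point shadow an earlier one.
		if !found || len(m.point) >= len(best.point) {
			best, found = m, true
		}
	}
	return best, found
}

// underMount reports whether dir is the mount point itself or below it.
func underMount(dir, point string) bool {
	if point == "/" {
		return strings.HasPrefix(dir, "/")
	}
	return dir == point || strings.HasPrefix(dir, point+"/")
}

// sampleMounts is an excerpt of the /proc/mounts output shown above.
const sampleMounts = `none / aufs rw,relatime,si=79190e56,dio,dirperm1 0 0
tmpfs /dev tmpfs rw,nosuid,mode=755 0 0
/dev/nbd0 /buildkey ext4 rw,relatime,data=ordered 0 0
tmpfs /workdir tmpfs rw,nosuid,nodev,relatime 0 0
`

func main() {
	for _, dir := range []string{"/tmp/workdir", "/workdir"} {
		m, ok := findMount(sampleMounts, dir)
		if !ok {
			fmt.Printf("%s: no mount found\n", dir)
			continue
		}
		fmt.Printf("%s is on %s (%s)\n", dir, m.point, m.fsType)
	}
}
```

With the excerpt above, /tmp/workdir resolves to the aufs root while /workdir resolves to the tmpfs mount, which matches what df reports.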

@gopherbot

Change https://golang.org/cl/77370 mentions this issue: cmd/buildlet: use tmpfs workdir if flag value is unspecified
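
Going only by the CL title, the idea can be sketched roughly as below. This is a guess at the shape of the logic, not the contents of the CL: `pickWorkDir`, `dirExists`, the `/workdir` candidate path, and the fallback under `os.TempDir()` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// pickWorkDir chooses a work directory. Hypothetical helper:
//   - an explicit flag value always wins;
//   - otherwise tmpfsDir is used if isDir reports that it exists;
//   - otherwise fall back to a directory under os.TempDir().
func pickWorkDir(flagVal, tmpfsDir string, isDir func(string) bool) string {
	if flagVal != "" {
		return flagVal
	}
	if isDir(tmpfsDir) {
		return tmpfsDir
	}
	return filepath.Join(os.TempDir(), "workdir")
}

// dirExists reports whether p exists and is a directory.
func dirExists(p string) bool {
	fi, err := os.Stat(p)
	return err == nil && fi.IsDir()
}

func main() {
	// With no flag value, prefer the container's /workdir mount if present.
	fmt.Println(pickWorkDir("", "/workdir", dirExists))
}
```

A fuller version would presumably also confirm that the candidate directory is really a tmpfs mount (for example by consulting /proc/mounts) rather than only checking that it exists.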
