x/build: Scaleway linux-arm builders don't have vdso support #33574

Closed
ianlancetaylor opened this issue Aug 9, 2019 · 18 comments
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done. new-builder

@ianlancetaylor
Contributor

As discussed at #32912, let's upgrade the linux-arm and linux-arm64 builders to bionic.

@ianlancetaylor ianlancetaylor added the NeedsFix The path to resolution is known, but the work has not been done. label Aug 9, 2019
@gopherbot gopherbot added this to the Unreleased milestone Aug 9, 2019
@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Aug 9, 2019
@bradfitz
Contributor

The linux-arm and linux-arm64 builders differ in just about every way. arm is on Scaleway: 1 builder per machine, 50 machines, with images prepared in a weird way. arm64 is on packet.net on one big machine, with builds running in Docker.

This issue should probably be split in two, with more context: is this about upgrading the host image (for a new kernel? which minimum level/feature?) or the container environment?

@ianlancetaylor
Contributor Author

The current kernels used on both the linux-arm and linux-arm64 builders do not use VDSO for the time functions. The purpose of this issue is to upgrade the kernel so that it uses VDSO on both kinds of systems. It is of course fine to split the issue in two.

@bradfitz
Contributor

bradfitz commented Nov 5, 2019

Our existing linux-arm machines are Xenial, running some weird netboot kernel, 4.10.8-docker-1 (built Wed Apr 5 16:04:23 UTC 2017).

But I tried out a Bionic instance on the same hardware and the kernel is still weird, but newer: 4.9.93-mainline-rev1 #1 SMP Tue Apr 10 09:42:40 UTC 2018.

Unfortunately this C1 hardware (their only 32-bit ARM hardware) doesn't support local boot. It can only NFS boot kernels they provide, AFAICT: https://github.com/scaleway/image-ubuntu/issues/132

Probably good enough.

@bradfitz bradfitz assigned bradfitz and unassigned dmitshur Nov 6, 2019
@gopherbot

Change https://golang.org/cl/205603 mentions this issue: all: upgrade scaleway linux-arm builders, stop using deprecated --reverse flag

gopherbot pushed a commit to golang/build that referenced this issue Nov 6, 2019
…erse flag

Updates golang/go#21260 (no more buildlets using the --reverse flag)
Updates golang/go#33574 (linux-arm kernel+userspace updated)

Change-Id: I7455f6fa3e851f1f9f81d6f1eb487ef7e4bea55b
Reviewed-on: https://go-review.googlesource.com/c/build/+/205603
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@bradfitz
Contributor

bradfitz commented Nov 6, 2019

Okay, linux-arm is now an Ubuntu bionic host (kernel 4.9.93) with Debian Buster containers. All deployed.

Arm64 (packet) remains.

@bradfitz
Contributor

bradfitz commented Nov 6, 2019

The Arm64 packet host is Ubuntu Xenial (4.10.0-26-generic) with Debian Buster containers. Upgrading it remotely now over ssh. Hope it survives the process. Otherwise we'll have to re-create it.

@bradfitz
Contributor

bradfitz commented Nov 6, 2019

The packet host upgraded & rebooted and everything seems to be Bionic now but the kernel is still the same:

root@go-builder:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.3 LTS
Release:        18.04
Codename:       bionic

root@go-builder:~# uname -a
Linux go-builder 4.10.0-26-generic #30~16.04.1-Ubuntu SMP Tue Jun 27 09:40:29 UTC 2017 aarch64 aarch64 aarch64 GNU/Linux

I see that it's supposed to be 4.15:
https://packages.ubuntu.com/bionic/linux-image-generic

And 4.15 is installed, but not in use:

root@go-builder:~# dpkg -l | grep linux-image
ii  linux-image-4.10.0-26-generic         4.10.0-26.30~16.04.1                            arm64        Linux kernel image for version 4.10.0 on ARMv8 SMP
ii  linux-image-4.15.0-68-generic         4.15.0-68.77                                    arm64        Linux kernel image for version 4.15.0 on ARMv8 SMP
ii  linux-image-extra-4.10.0-26-generic   4.10.0-26.30~16.04.1                            arm64        Linux kernel extra modules for version 4.10.0 on ARMv8 SMP
ii  linux-image-generic                   4.15.0.68.70                                    arm64        Generic Linux kernel image

@ianlancetaylor, when did Linux start using VDSOs on these architectures?

@ianlancetaylor
Contributor Author

It's a configuration option when the kernel is built.

I believe that on ARM64 it was available as of kernel version 2.6.39.

@bradfitz
Contributor

bradfitz commented Nov 6, 2019

https://blog.linuxplumbersconf.org/2016/ocw/system/presentations/3711/original/LPC_vDSO.pdf suggests arm got it in 4.1, but that means the old Xenial kernel should've been new enough.

904f046 has a test but doesn't t.Skip anywhere that I can see, so running that test wouldn't really tell me if the host supported it.

@bradfitz
Contributor

bradfitz commented Nov 6, 2019

I see CONFIG_GENERIC_TIME_VSYSCALL=y on both the 4.10 and 4.15 packet kernels.

@ianlancetaylor
Contributor Author

I think I decided that VDSO was not available based on /boot/config or /proc/config.gz; I don't remember which.

@ianlancetaylor
Contributor Author

But if I made a mistake on that, I apologize. I was working on a test that should have failed when using VDSO, for #32912, which has since been fixed anyhow (the test is now TestVDSO in the runtime package).

@bradfitz
Contributor

bradfitz commented Nov 6, 2019

On Packet, with the Linux 4.10.0-26-generic aarch64 kernel, I get:

# strace date
execve("/bin/date", ["date"], 0xffffe6d8f660 /* 20 vars */) = 0
brk(NULL)                               = 0xaaab0c3df000
faccessat(AT_FDCWD, "/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
faccessat(AT_FDCWD, "/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=26915, ...}) = 0
mmap(NULL, 26915, PROT_READ, MAP_PRIVATE, 3, 0) = 0xffffa88db000
close(3)                                = 0
faccessat(AT_FDCWD, "/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0 \10\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1341080, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffffa88d9000
mmap(NULL, 1409880, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xffffa875f000
mprotect(0xffffa889f000, 61440, PROT_NONE) = 0
mmap(0xffffa88ae000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x13f000) = 0xffffa88ae000
mmap(0xffffa88b4000, 13144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xffffa88b4000
close(3)                                = 0
mprotect(0xffffa88ae000, 16384, PROT_READ) = 0
mprotect(0xaaaae39b0000, 8192, PROT_READ) = 0
mprotect(0xffffa88e4000, 4096, PROT_READ) = 0
munmap(0xffffa88db000, 26915)           = 0
brk(NULL)                               = 0xaaab0c3df000
brk(0xaaab0c400000)                     = 0xaaab0c400000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=1683056, ...}) = 0
mmap(NULL, 1683056, PROT_READ, MAP_PRIVATE, 3, 0) = 0xffffa85c4000
close(3)                                = 0
openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=127, ...}) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=127, ...}) = 0
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 127
lseek(3, -71, SEEK_CUR)                 = 56
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 71
close(3)                                = 0
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
write(1, "Wed Nov  6 22:49:36 UTC 2019\n", 29Wed Nov  6 22:49:36 UTC 2019
) = 29
close(1)                                = 0
close(2)                                = 0
exit_group(0)              

... which certainly looks like it's not doing a system call for the time.
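
The same reading of a trace can be automated. The following is only a minimal sketch, not anything the builders actually run: the function name `time_syscalls` and the set of syscall names are assumptions chosen for illustration. It scans saved `strace` output and reports any time-related system calls, which is what a libc would fall back to when the kernel provides no usable vDSO.

```python
import re

# Syscalls a libc falls back to for wall-clock time when it cannot use the vDSO.
# This set is an assumption for illustration and may not be exhaustive.
TIME_SYSCALLS = {"clock_gettime", "clock_gettime64", "gettimeofday", "time"}

# Matches the syscall name at the start of a line, allowing for the
# "[pid 123] " prefix that `strace -f` adds.
_CALL = re.compile(r"^(?:\[pid\s+\d+\]\s+)?(\w+)\(")


def time_syscalls(strace_text):
    """Return the time-related syscall names seen, in order of appearance."""
    seen = []
    for line in strace_text.splitlines():
        m = _CALL.match(line)
        if m and m.group(1) in TIME_SYSCALLS:
            seen.append(m.group(1))
    return seen
```

An empty result only suggests the vDSO path when the traced program is known to read the clock, as `date` does; a program that never asks for the time would also produce an empty list.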

@bradfitz
Contributor

bradfitz commented Nov 6, 2019

Also on that host:

root@go-builder:~# cat /proc/self/maps
aaaae578f000-aaaae5796000 r-xp 00000000 08:03 11010096                   /bin/cat
aaaae57a6000-aaaae57a7000 r--p 00007000 08:03 11010096                   /bin/cat
aaaae57a7000-aaaae57a8000 rw-p 00008000 08:03 11010096                   /bin/cat
aaab0a330000-aaab0a351000 rw-p 00000000 00:00 0                          [heap]
ffffadfb9000-ffffadfdb000 rw-p 00000000 00:00 0 
ffffadfdb000-ffffae176000 r--p 00000000 08:03 2752623                    /usr/lib/locale/locale-archive
ffffae176000-ffffae2b6000 r-xp 00000000 08:03 265458                     /lib/aarch64-linux-gnu/libc-2.27.so
ffffae2b6000-ffffae2c5000 ---p 00140000 08:03 265458                     /lib/aarch64-linux-gnu/libc-2.27.so
ffffae2c5000-ffffae2c9000 r--p 0013f000 08:03 265458                     /lib/aarch64-linux-gnu/libc-2.27.so
ffffae2c9000-ffffae2cb000 rw-p 00143000 08:03 265458                     /lib/aarch64-linux-gnu/libc-2.27.so
ffffae2cb000-ffffae2cf000 rw-p 00000000 00:00 0 
ffffae2cf000-ffffae2ec000 r-xp 00000000 08:03 262153                     /lib/aarch64-linux-gnu/ld-2.27.so
ffffae2f0000-ffffae2f2000 rw-p 00000000 00:00 0 
ffffae2f9000-ffffae2fa000 r--p 00000000 00:00 0                          [vvar]
ffffae2fa000-ffffae2fb000 r-xp 00000000 00:00 0                          [vdso]
ffffae2fb000-ffffae2fc000 r--p 0001c000 08:03 262153                     /lib/aarch64-linux-gnu/ld-2.27.so
ffffae2fc000-ffffae2fe000 rw-p 0001d000 08:03 262153                     /lib/aarch64-linux-gnu/ld-2.27.so
ffffd53f5000-ffffd5416000 rw-p 00000000 00:00 0                          [stack]

I see [vdso] there.
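
For checking many hosts, the look at /proc/self/maps could be scripted. This is a rough sketch under the assumption that the maps format is the usual six-column layout shown above; `special_mappings` and `has_vdso` are hypothetical helper names, not part of any Go or x/build tooling.

```python
def special_mappings(maps_text):
    """Return {name: (start, end)} for bracketed pseudo-mappings such as [vdso]."""
    found = {}
    for line in maps_text.splitlines():
        fields = line.split()
        # A maps line has five fixed fields; the optional sixth is the pathname.
        if len(fields) < 6 or not fields[5].startswith("["):
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        found[fields[5]] = (start, end)
    return found


def has_vdso(maps_text):
    """Report whether the address space described by maps_text maps a vDSO."""
    return "[vdso]" in special_mappings(maps_text)


if __name__ == "__main__":
    try:
        with open("/proc/self/maps") as f:
            print("vdso mapped:", has_vdso(f.read()))
    except OSError as err:
        print("cannot read /proc/self/maps:", err)
```

Note that this only shows the kernel mapped a vDSO into the process; it does not by itself prove that the time functions are served from it.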

@bradfitz
Contributor

bradfitz commented Nov 6, 2019

But on Scaleway, even with Bionic, the 4.9.93-mainline-rev1 armv7 kernel it netboots doesn't have vdso in /proc/self/maps and strace on date shows:

clock_gettime(CLOCK_REALTIME, {tv_sec=1573080827, tv_nsec=563875911}) = 0

And in /proc/config.gz:

root@bionic-tmpl:~# zless /proc/config.gz | grep VSYS
root@bionic-tmpl:~# zless /proc/config.gz | grep TIME
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_HIGH_RES_TIMERS=y
# CONFIG_IRQ_TIME_ACCOUNTING is not set
CONFIG_TIMERFD=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
# CONFIG_HAVE_ARM_ARCH_TIMER is not set
# CONFIG_PARAVIRT_TIME_ACCOUNTING is not set
# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
CONFIG_NF_CONNTRACK_TIMEOUT=y
CONFIG_NF_CONNTRACK_TIMESTAMP=y
CONFIG_NF_CT_NETLINK_TIMEOUT=m
CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
CONFIG_NETFILTER_XT_MATCH_TIME=m
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
# CONFIG_HW_RANDOM_TIMERIOMEM is not set
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set
CONFIG_SND_TIMER=m
CONFIG_SND_PCM_TIMER=y
# CONFIG_SND_HRTIMER is not set
# CONFIG_RTC_DRV_HID_SENSOR_TIME is not set
CONFIG_ARMADA_370_XP_TIMER=y
CONFIG_ORION_TIMER=y
# CONFIG_ARM_TIMER_SP804 is not set
# CONFIG_SH_TIMER_CMT is not set
# CONFIG_SH_TIMER_MTU2 is not set
# CONFIG_SH_TIMER_TMU is not set
# CONFIG_EM_TIMER_STI is not set
CONFIG_JFFS2_RTIME=y
CONFIG_PRINTK_TIME=y
CONFIG_PANIC_TIMEOUT=0
# CONFIG_DEBUG_TIMEKEEPING is not set
CONFIG_TIMER_STATS=y
CONFIG_RCU_CPU_STALL_TIMEOUT=21

Looks like it's just not built with vdso support.

But not sure we can build our own kernel+initrd "bootscript" (https://www.scaleway.com/en/docs/bootscript-and-how-to-use-it/) on Scaleway.

See also https://github.com/scaleway/image-ubuntu/issues/132 ... maybe we can kexec a newer one?

# zless /proc/config.gz | grep KEXEC
CONFIG_KEXEC_CORE=y
CONFIG_KEXEC=y
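
The config greps above can be made a little more precise by parsing the file rather than pattern-matching on substrings. The sketch below is illustrative only; `parse_kernel_config` and `load_running_config` are hypothetical names, and which option actually controls the vDSO on a given architecture is left to the caller rather than assumed here.

```python
import gzip
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kernel_config(text):
    """Map option name -> value string; options marked 'is not set' map to None.

    Options that do not appear in the file at all are simply absent from the
    result, which distinguishes "disabled" from "unknown to this kernel".
    """
    options = {}
    for line in text.splitlines():
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = None
    return options


def load_running_config(path="/proc/config.gz"):
    """Parse the running kernel's config; the file exists only on some kernels."""
    with gzip.open(path, "rt") as f:
        return parse_kernel_config(f.read())
```

With the Scaleway output above, `CONFIG_KEXEC` would map to `"y"`, `CONFIG_HAVE_ARM_ARCH_TIMER` to `None`, and `CONFIG_GENERIC_TIME_VSYSCALL` would be missing entirely.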

@bradfitz bradfitz changed the title x/build: update linux-arm and linux-arm64 builders to bionic x/build: Scaleway linux-arm builders don't have vdso support Nov 6, 2019
@bradfitz bradfitz removed their assignment Nov 6, 2019
@bradfitz
Contributor

bradfitz commented Nov 6, 2019

/cc @cagedmantis

@bcmills
Contributor

bcmills commented Nov 11, 2021

The Scaleway builders were removed in June (#45066). Is this still an issue with the current linux-arm builders?

@dmitshur
Contributor

dmitshur commented May 10, 2022

As of today, we have linux/arm and linux/arm64 AWS builders, and linux/arm64 Equinix (previously named Packet) builders. I tried to check, and as far as I can tell vDSO support is available on both linux/arm64 builder types, but not on the linux/arm (32-bit) AWS one.


I'm not sure if my check was thorough; I pieced together what I read in this issue and in #32912, plus searching on the internet. The /proc/config.gz file wasn't available on the AWS builders, so I used other signals. The relevant log:

~ $ gomote run user-$USER-linux-arm-aws-0 '/bin/bash' '-c' 'cat /proc/self/maps | grep vdso'
Error running run: exit status 1  # Error is due to 0 grep matches.
~ $ gomote run user-$USER-linux-arm-aws-0 '/bin/bash' '-c' 'ldd /bin/uname'                 
	libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0xf7cbe000)
	/lib/ld-linux-armhf.so.3 (0xf7dce000)

~ $ gomote run user-$USER-linux-arm64-aws-0 '/bin/bash' '-c' 'cat /proc/self/maps | grep vdso'
ffff84154000-ffff84155000 r-xp 00000000 00:00 0                          [vdso]
~ $ gomote run user-$USER-linux-arm64-aws-0 '/bin/bash' '-c' 'ldd /bin/uname'
	linux-vdso.so.1 (0x0000ffff8cee6000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff8cd2e000)
	/lib/ld-linux-aarch64.so.1 (0x0000ffff8ceb8000)

~ $ gomote run user-$USER-linux-arm64-packet-0 '/bin/bash' '-c' 'cat /proc/self/maps | grep vdso' 
ffff7de1c000-ffff7de1d000 r-xp 00000000 00:00 0                          [vdso]
~ $ gomote run user-$USER-linux-arm64-packet-0 '/bin/bash' '-c' 'ldd /bin/uname'
	linux-vdso.so.1 (0x0000ffff81311000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff81181000)
	/lib/ld-linux-aarch64.so.1 (0x0000aaaacf87b000)

If there's any future work to do, we should track it in new issue(s), so closing this one.

@dmitshur dmitshur added this to Done in Go Release Team May 10, 2022
@golang golang locked and limited conversation to collaborators May 10, 2023