cmd/go: clean GOCACHE based on disk usage #29561

Open
4a6f656c opened this issue Jan 4, 2019 · 31 comments
Labels
FeatureRequest, GoCommand (cmd/go), NeedsDecision (feedback is required from experts, contributors, and/or the community before a change can be made)
Milestone

Comments

@4a6f656c
Contributor

4a6f656c commented Jan 4, 2019

The GOCACHE appears to lack a disk size limit - this is a problem in a space constrained environment and/or when running go on a disk that is nearing capacity. For example, on the openbsd/arm builder (which runs on a USB stick), the ~/.cache/go-build directory runs past several GB in a very short time, which then leads to various failures (git clone or go builds). The only option that I currently appear to have is to run go clean -cache regularly, in order to keep the cache at a respectable size. It seems that having a configurable upper bound would be preferable, and/or free disk space based checks (e.g. evict then write) that prevent writes to the cache from failing when the disk is full, partly due to a large GOCACHE:

[gopher@cubox ~ 102]$ du -csh ~/.cache/go-build
2.2G    /home/gopher/.cache/go-build
2.2G    total
[gopher@cubox ~ 103]$ df -h /home
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd1m      9.0G    8.5G   -6.0K   100%    /home
[gopher@cubox ~ 104]$ go build -o /tmp/main /tmp/main.go

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full

/home: write failed, file system is full
[gopher@cubox ~ 105]$ du -csh ~/.cache/go-build
2.2G    /home/gopher/.cache/go-build
2.2G    total
[gopher@cubox ~ 106]$ df -h /home
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd1m      9.0G    8.5G   -6.0K   100%    /home
@jayconrod jayconrod added the NeedsDecision, FeatureRequest, and GoCommand (cmd/go) labels Jan 4, 2019
@jayconrod jayconrod added this to the Go1.13 milestone Jan 4, 2019
@jayconrod
Contributor

I think this would be a good thing to have, especially since the cache will be required starting in Go 1.12. I don't think there's a fixed maximum cache size that would work for everyone, but maybe we could at least make eviction more aggressive when the disk is low on space.

@mvdan
Member

mvdan commented Jan 4, 2019

We're no longer allowing GOCACHE=off starting in 1.12, so it might be a bit tricky to allow an arbitrary size limit. One could simply swap GOCACHE=off with GOCACHELIMIT=0, for example.

I seem to remember that @rsc designed the cache to automatically evict based on modification time. Seems sound to me to also automatically evict if the disk is full enough that it might realistically hit errors if it builds a few large packages (say, if there's less than 500MB left).

@bitfield

bitfield commented Jan 5, 2019

This sounds like a bit of a can of worms. How do you check for remaining disk space in a cross-platform way? How much free space should trigger a cache eviction? How do you guarantee that, having evicted some cache, the build process won't still run out of disk anyway?

If you're building on a small disk, then wouldn't running go clean -cache before each build be a better solution than trying to build disk space management into the build tool itself?

@mvdan
Member

mvdan commented Jan 5, 2019

I personally don't know how easy it would be to control "does the disk have enough space left" in a sane and portable way. I was simply pointing out that since the current eviction algorithm takes into account timestamps, perhaps it should also use some data from the filesystem or disk. If we can make that work, we could avoid adding more knobs to the go tool.

If you're building on a small disk, then wouldn't running go clean -cache before each build be a better solution than trying to build disk space management into the build tool itself?

That might not be a great solution - what if you're building a very large project like Kubernetes? It might produce many hundreds of megabytes of build cache, so it's not unreasonable to think that it could on its own be enough to fill up some filesystems.

@bitfield

bitfield commented Jan 5, 2019

That might not be a great solution - what if you're building a very large project like Kubernetes? It might produce many hundreds of megabytes of build cache, so it's not unreasonable to think that it could on its own be enough to fill up some filesystems.

Indeed. So this proposal would, at best, kick the problem slightly further down the road.

@bcmills bcmills modified the milestones: Go1.13, Unplanned Jan 15, 2019
@bcmills bcmills changed the title from "cmd/go/internal/cache: GOCACHE appears to lack disk size limit" to "cmd/go: clean GOCACHE based on disk usage" Jan 15, 2019
@josharian josharian modified the milestones: Unplanned, Go1.14 May 9, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@josharian
Contributor

I just filled up my hard drive again, and am waiting for go clean -cache to delete an untold number of tiny files, which history suggests will take hours.

Since GOCACHE=off is off, how about a cmd/go flag asking it not to cache the results of this particular build/test/compilation? I think that most people who are hitting this are doing something unusual and know it and just need some way to prevent the damage.

@ianlancetaylor
Contributor

I'm not going to claim that this is the best possible solution, but I just run go clean -cache regularly. In other words 1) I know that I am doing something unusual; 2) I have a way to prevent the damage.

@mvdan
Member

mvdan commented Jan 29, 2020

Those of you who do "unusual things", what sizes do you encounter that are a problem? I can't remember the last time I cleaned GOCACHE, and it's currently sitting at 1.4GiB. Given that my SSD is 140GiB and I only use ~40% of it, that seems fine.

@ALTree
Member

ALTree commented Jan 29, 2020

I got a 200GB gocache folder once, after 12hrs of compiler fuzzing.

@josharian
Contributor

@mvdan this morning it clocked in at a little over 250GB. In the past I’ve hit 400GB. I’d probably have hit that last night except my script died because I ran out of disk space.

@ianlancetaylor I see what you did there. :) To expand on why that isn’t a suitable solution for me:

In normal use, everything is fine. This only happens to me when I start an overnight computation, like compiling 40 different toolchain commits for every platform.

I’d have to run go clean -cache in the middle of the night. I could start a script to do that, but that is an easy thing to forget. I could set up a cron job to clear it every hour, but that would slow down my non-Go-toolchain work, which is a higher priority. Or I could have the script doing the work clean the cache, except that this is a script I’m making publicly available (compilecmp), which means I’ll be clearing other people’s caches, which seems unfriendly.

I guess I could have my script create a temp dir, set GOCACHE to it, and clear it regularly. I’ll try that.

@josharian
Contributor

Another reason to want to be able to disable the cache in these circumstances is to avoid the wear and tear on my SSD of writing and then immediately deleting 100s of GB. I could set up a RAM disk, except that that is fiddly and platform-specific, and I’m trying to maintain a tool to be used by not-just-me.

@mvdan
Member

mvdan commented Jan 29, 2020

Your workflows are definitely heavier than mine :) is there a way to expose this "no GOCACHE" mode only to advanced users, so that we don't encourage the broader community to turn off the cache in general? Perhaps hide it behind an undocumented flag?

@josharian
Contributor

One idea: -a already asks cmd/go to ignore the cache; maybe it could also not write the new entries?

I'm not sure I fully understand why there's so much concern about people disabling the cache. We've forced it on everyone for long enough that we should be past the FUD. And if people really want to waste resources, that's their business. And for the folks who have a genuine need to disable the cache, they can.

@mdempsky
Member

I just ran into this issue. I've known about the Go build cache, but I assumed it was managed automatically by cmd/go, or that it would warn me when manual intervention is required (like how git auto-GCs or whatever, but occasionally nags you into doing that manually).

FWIW, running "go clean -cache" ended up freeing 212GB of my workstation's 730GB disk.

@ALTree
Member

ALTree commented Jan 27, 2021

@mdempsky too much defer fuzzing? :^) One workaround I've adopted for compiler fuzzing: in the driver, invoking go tool compile on the .go file (and go tool link on the resulting .o if you need to run the program) does not cause any write to the cache (unlike a go build or go run invocation).

@firelizzard18
Contributor

I don't think I'm doing anything special (just working on a ~100kLOC project) but I regularly run out of space. It takes a month or two, but I have a 240GB disk with ~100GB free and the go cache will eat all of that eventually. I could set something up but it's pretty obnoxious that Go eats all my available disk space.

@firelizzard18
Contributor

I personally don't know how easy it would be to control "does the disk have enough space left" in a sane and portable way. I was simply pointing out that since the current eviction algorithm takes into account timestamps, perhaps it should also use some data from the filesystem or disk. If we can make that work, we could avoid adding more knobs to the go tool.

@mvdan Maybe an environment variable for "Limit the go cache to X% of the disk"? A default value of 10% or maybe even 2% would be fine on the vast majority of systems. Having more than 10% of my disk consumed by a build cache is not ideal. For seriously space constrained systems, add a note to the documentation, "If you build go on a very small disk you may want to increase this limit".

@mvdan
Member

mvdan commented Jul 3, 2022

@firelizzard18 it's surprising that you say it takes a month or two for your cache to fill your disk. Cache entries get cleared at one-day intervals if they are older than five days, as you can see at:

// Time constants for cache expiration.
//
// We set the mtime on a cache file on each use, but at most one per mtimeInterval (1 hour),
// to avoid causing many unnecessary inode updates. The mtimes therefore
// roughly reflect "time of last use" but may in fact be older by at most an hour.
//
// We scan the cache for entries to delete at most once per trimInterval (1 day).
//
// When we do scan the cache, we delete entries that have not been used for
// at least trimLimit (5 days). Statistics gathered from a month of usage by
// Go developers found that essentially all reuse of cached entries happened
// within 5 days of the previous reuse. See golang.org/issue/22990.
const (
	mtimeInterval = 1 * time.Hour
	trimInterval  = 24 * time.Hour
	trimLimit     = 5 * 24 * time.Hour
)

So you would have to build tons of different Go packages within a few days to realistically grow your build cache to hundreds of gigabytes. That's the kind of stress that @ALTree does while fuzzing, or @josharian while compiling many toolchain versions for many platforms.

The only other scenario we're aware of is small disk sizes, like the author's 9GiB USB stick.

My personal take is that, if the build cache sizes are often a problem while stressing the Go toolchain with tons of different builds, we should start by providing those advanced users with a way to disable build caching entirely, like @josharian's suggestion in #29561 (comment). I don't think any regular users would try to use -a as a way to get the old GOCACHE=off back.

@firelizzard18
Contributor

@mvdan I wiped my cache (go clean -cache) 18 days ago (when I made my first comment here). Since then, ~/.cache/go-build has grown to 85G. I have primarily been working on Accumulate plus some occasional work on (my forks of) Tendermint and golangci-lint. I haven't been doing anything crazy.

@mvdan
Member

mvdan commented Jul 20, 2022

Can you look into what's the oldest file in that directory tree, by modified time? None of the files should be older than five or six days.

@firelizzard18
Contributor

firelizzard18 commented Jul 20, 2022

❯ find ~/.cache/go-build/*/ -type f -printf '%Ab %Ad %AY\n' | sort | uniq -c
   8628 Jul 15 2022
  44935 Jul 18 2022
  36235 Jul 19 2022
  82237 Jul 20 2022

Looks like the oldest file is 5 days old. I'm copying everything onto another drive so I can clear my main drive.

@seankhliao
Member

Is it some IDE-driven process that constantly compiles/tests?

@firelizzard18
Contributor

@seankhliao VSCode runs golangci-lint whenever I save a file. If I make a change that touches a lot of the codebase, my CPU starts chugging probably because golangci-lint needs to recompile a bunch of stuff. I run tests frequently, but not automatically.

@mvdan
Member

mvdan commented Jul 21, 2022

That's probably why your build cache fills with tens of gigabytes within five days. Running static analysis at every file save will rebuild any modified packages and their transitive dependents, adding new entries to the build cache each time. So you're likely increasing the size of your build cache in the order of megabytes for five days each time you save.

We should still do better, but what you're seeing seems to be in line with the current intended behavior.

@firelizzard18
Contributor

IMO a build cache should not consume more than 10% of the space on my PC, and maybe not even that much. It makes sense to allow the build cache to consume a lot more of the disk for a dedicated builder. I think a max disk space percentage environment variable (configurable with go env -w) with a sane default value is a good solution.

@rsc
Contributor

rsc commented Sep 23, 2022

The main benefit of the current algorithm is simplicity: we just scan the tree for files that haven't been used in 5 days and delete them. We could drop the number of days, but the idea was to avoid a slow start after a long weekend. It is also possible to do incrementally and requires no global state, locks, or synchronization between different instances of the go command. It also automatically sizes to the working set rather than using a fixed amount of disk or thrashing in too small an amount of disk. I'm happy to consider other algorithms, but they need these properties too.

Here is one possibility. The cache is basically a mark-sweep GC for old files, using the mtime as an auto-expiring mark bit to keep a file from being reclaimed for a while. We could instead use a generational GC, with a young generation and an old generation. If the cache is filling up with files that are used for a single build or only for a few minutes and then never used again, we could collect the young generation more aggressively. Specifically, we could keep a young/old bit in the mtime as well, perhaps in the low order bits. Every time we write a new cache file, set its mtime to time.Now().Truncate(time.Minute), so that it ends in :00 (0 seconds). If it gets used after an hour, we set the updated mtime to end in :30. Then the sweep can distinguish young vs old files and delete young files more aggressively.

If someone has a usage pattern that creates a lot of cache and wants to try it out, patch in https://go.dev/cl/433295.

@gopherbot

Change https://go.dev/cl/433295 mentions this issue: cmd/go/internal/cache: implement generational GC for GOCACHE

@firelizzard18
Contributor

I recently reinstalled Linux on a larger drive. I'll check what my cache looks like at the end of next week, then switch to https://go.dev/cl/433295 and see what it looks like the week after that.

@firelizzard18
Contributor

I don't know if my usage has changed or there's something weird about my previous environment, but I haven't seen the cache go over 10 GB since.

@4a6f656c
Contributor Author

Unfortunately, this issue still exists - I've run into this multiple times in the last couple of weeks while building and debugging Go on a 16GB partition:

[joel@fox ~/src/go/src 1433]$ ../bin/go test -c runtime -tags debuglog
...
/home: write failed, file system is full
/home: write failed, file system is full
/home: write failed, file system is full
/home: write failed, file system is full
go: no such tool "vet"
go: no such tool "vet"
go: no such tool "vet"
[joel@fox ~/src/go/src 1432]$ df -h .
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd0g     15.5G   14.7G      0B   100%    /home
[joel@fox ~/src/go/src 1431]$ du -csh /home/joel/.cache/go-build/
11.8G   /home/joel/.cache/go-build/
11.8G   total
[joel@fox ~/src/go/src 1433]$ go clean -cache
[joel@fox ~/src/go/src 1434]$ df -h .
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd0g     15.5G    2.9G   11.8G    20%    /home

@kaber2

kaber2 commented Feb 12, 2024

Same problem here, the cache regularly grows to >50GB and fills up my partition. I usually don't notice until things (including completely unrelated things) start failing.

I have to wonder, I have plenty of caches in ~/.cache, including ccache, node, bazelisk, etc. None of these exhibits this problem, why is Go special?
