Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: atomicity guarantees of os.File.Write #49877

Closed
motiejus opened this issue Nov 30, 2021 · 10 comments
Closed

os: atomicity guarantees of os.File.Write #49877

motiejus opened this issue Nov 30, 2021 · 10 comments
Labels
Documentation FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@motiejus
Copy link
Contributor

motiejus commented Nov 30, 2021

What version of Go are you using (go version)?

go1.17beta1-2573-g18934e11ba

What did you see?

Documentation on os#File.Write does not mention that it is atomic (that is, multiple Write calls will not end up with interleaved output). However, it seems to be atomic (tracing from os.File#Write to poll.FD#Write is explained in a later comment):

motiejus ~/code/go/src/internal/poll $ git grep -w -A3 "func.*\<Write"
fd_plan9.go:func (fd *FD) Write(fn func([]byte) (int, error), b []byte) (int, error) {
fd_plan9.go-    if err := fd.writeLock(); err != nil {
fd_plan9.go-            return 0, err
fd_plan9.go-    }
--
fd_unix.go:func (fd *FD) Write(p []byte) (int, error) {
fd_unix.go-     if err := fd.writeLock(); err != nil {
fd_unix.go-             return 0, err
fd_unix.go-     }
--
fd_windows.go:func (fd *FD) Write(buf []byte) (int, error) {
fd_windows.go-  if err := fd.writeLock(); err != nil {
fd_windows.go-          return 0, err
fd_windows.go-  }

As this is not described in the official documentation, it leads to misunderstandings such as this:

<...> In particular, *os.Files must be locked before use.

... and excessive locking: all writes in Zap are thus additionally protected via a mutex. It would make sense to remove the lock. Removing this lock would be easier if underlying function were documented to be safe for concurrent use, and guaranteed to not interleave output.

What did you expect to see?

Explanation that concurrent use of os.File#Write will not interleave output.

Am I missing a case where it isn't and there is a risk of interleaved output? If you believe it makes sense to write it down (and downstreams to rely on), I will submit a PR with a documentation change.

@seankhliao seankhliao changed the title [documentation] atomicity guarantees of os.File#Write os: atomicity guarantees of os.File.Write Nov 30, 2021
@robpike
Copy link
Contributor

robpike commented Nov 30, 2021

This depends on the operating system that actually implements the write system call. They might all guarantee atomicity. I suspect not, although in practice it is likely to be hard to see the property fail.

Note that none of the hits from grep above are in the os package.

@motiejus
Copy link
Contributor Author

motiejus commented Nov 30, 2021

This depends on the operating system that actually implements the write system call. They might all guarantee atomicity. I suspect not, although in practice it is likely to be hard to see the property fail.

Whether the system provides it or it doesn't is a tangential issue; this one is about the golang's current runtime guarantees. Once the docs are clarified (and the runtime contract is documented), I am planning to look into at least the Linux version whether that locking still makes sense. But that's for later.

Note that none of the hits from grep above are in the os package.

os package ends up in the poll package, I skipped tracing for brewity. Here is the call path for non-plan9:

os.File#Write -> os.File#write -> poll.FD#Write

... and that's how we end up in the poll package. Does that explain the grep?

Note: plan9 calls the plain syscall. plan9's write(2) sort-of mentions "multiprocess programs without interference". It's not a proof it works as I would expect it to, but is a positive sign for a deeper investigation.

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 30, 2021
@mknyszek mknyszek added this to the Backlog milestone Nov 30, 2021
@mknyszek
Copy link
Contributor

@prattmic
Copy link
Member

prattmic commented Nov 30, 2021

This depends on the operating system that actually implements the write system call. They might all guarantee atomicity. I suspect not, although in practice it is likely to be hard to see the property fail.

Whether the system provides it or it doesn't is a tangential issue; this one is about the golang's current runtime guarantees. Once the docs are clarified (and the runtime contract is documented).

Could you clarify what you are looking for? It sounds like you are looking for a documented contract something like "Write may be called concurrently and will perform the entire write either before or after concurrent calls."

I believe that this statement is true as I've written it because of the write locking in internal/poll (only double-checked on unix), but we may make multiple system calls, so we cannot guarantee atomicity from the perspective of other processes which may be relevant in some cases (e.g., on Linux, writes to a pipe are atomic only up to 4096 bytes).

@motiejus
Copy link
Contributor Author

motiejus commented Nov 30, 2021

Could you clarify what you are looking for? It sounds like you are looking for a documented contract something like "Write may be called concurrently and will perform the entire write either before or after concurrent calls."
I believe that this statement is true as I've written it because of the write locking in internal/poll (only double-checked on unix),

Yes, this is exactly what I am trying to get documented. Thank you for phrasing it so succinctly.

but we may make multiple system calls, so we cannot guarantee atomicity from the perspective of other processes which may be relevant in some cases (e.g., on Linux, writes to a pipe are atomic only up to 4096 bytes).

Which means the "tangential issue above" may not be true. But let's keep the scope to the original issue (the contract of the code the way it is today) over here.

@slrz
Copy link

slrz commented Dec 1, 2021

... and excessive locking: all writes in Zap are thus additionally protected via a mutex. It would make sense to remove the lock. Removing this lock would be easier if underlying function were documented to be safe for concurrent use, and guaranteed to not interleave output.

Even if there was such a guarantee for os.File, Zap appears to accept any old io.Writer here. Those lack not only the promise of not interleaving output, they may not even be safe to invoke concurrently at all without the risk of corrupting internal state.

I guess it's similar for a lot of these cases, although eliding the lock might still be possible for the specific case of os.File (by type-asserting).

@abhinav
Copy link
Contributor

abhinav commented Dec 1, 2021

That's valid, @slrz.
The guarantee would allow us to update the recommendation in Zap's documentation.
Per zapcore.Lock:

Lock wraps a WriteSyncer in a mutex to make it safe for concurrent use. In particular, *os.Files must be locked before use.

If os.File guarantees that a single Write([]byte) will not be interleaved with another concurrent write on the same os.File, no matter how large the []byte, then the recommendation above and the corresponding unnecessary locks can be dropped.

@bcmills
Copy link
Contributor

bcmills commented Dec 1, 2021

I believe that this statement is true as I've written it because of the write locking in internal/poll (only double-checked on unix), but we may make multiple system calls, so we cannot guarantee atomicity from the perspective of other processes which may be relevant in some cases (e.g., on Linux, writes to a pipe are atomic only up to 4096 bytes).

I think that is correct. (Compare #38751 (comment), in which we discovered that the (*FD).Write calls are locked but may result in multiple write syscalls.)

It may also be possible for write syscalls to be interleaved for file descriptors that refer to the same underlying open file description (for example, if they are aliased by a dup or dup2 syscall), even within a single process.

@ianlancetaylor
Copy link
Contributor

Previous discussion at https://golang.org/cl/174957.

As others have noted, there are various (unlikely) cases where writes can be interleaved. Another case I haven't seen mentioned is that concurrent calls to the WriteAt method can be interleaved.

Although the current implementation takes a write lock, it's not obvious that we want to constrain all future implementations to take a write lock, given that the write lock is not in itself enough to absolutely guarantee that writes are never interleaved.

So I don't think we should document anything here.

@motiejus
Copy link
Contributor Author

motiejus commented Dec 8, 2021

Thanks for pointing to the original issue.

@motiejus motiejus closed this as completed Dec 8, 2021
@golang golang locked and limited conversation to collaborators Dec 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Documentation FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

9 participants