Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os,fmt: Documentation should mention that os.Stdout, and fmt.Print* functions are unbuffered #36619

Open
kohlrabi opened this issue Jan 17, 2020 · 7 comments
Labels
Documentation NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@kohlrabi
Copy link

kohlrabi commented Jan 17, 2020

What version of Go are you using (go version)?

$ go version
go version go1.13.5 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/terasa/.cache/go-build"
GOENV="/home/terasa/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/terasa/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go-1.13"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.13/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build094455032=/tmp/go-build -gno-record-gcc-switches"
GOROOT/bin/go version: go version go1.13.5 linux/amd64
GOROOT/bin/go tool compile -V: compile version go1.13.5
uname -sr: Linux 5.3.0-3-amd64
Distributor ID:	Debian
Description:	Debian GNU/Linux bullseye/sid
Release:	testing
Codename:	bullseye
/lib/x86_64-linux-gnu/libc.so.6: GNU C Library (Debian GLIBC 2.29-7) stable release version 2.29.
gdb --version: GNU gdb (Debian 8.3.1-1) 8.3.1

What did you do?

I just wanted to print formatted elements of an array, in this case I wanted to prepend a 0x to every byte in my array. The idiomatic C routine goes somewhere along these lines:

print.c:

#include <stdio.h>

int main(int argc, char **argv)
{
    int i;
    unsigned char n[100000] = { 0 };

    for(i=0; i<100000; i++) {
        printf(" 0x%02x", n[i]);
    }
    printf("\n");

    return 0;
}

The performance measured by just running time ./print on my machine is

real	0m0.053s
user	0m0.018s
sys	0m0.005s

Naively using the same approach in go lead to disastrous performance:
print_loop.go:

package main

import "fmt"

func main() {
	var n [100000]byte

	for _, v := range n {
		fmt.Printf(" 0x%02x", v)
	}
	fmt.Printf("\n")
}

I go build, and then run the resulting binary with time ./print_loop:

real	0m0.373s
user	0m0.142s
sys	0m0.207s

I realize that printing all elements of a byte array is possible by using "% x" as format string, and passing the array to that completely, like fmt.Printf("0x% x", n), but that is not exactly what I want, since it does not prepend 0x to every byte.

I realized by "chance" and some googling (https://stackoverflow.com/questions/13422381/idiomatically-buffer-os-stdout, https://grokbase.com/t/gg/golang-nuts/158cgb3rfd/go-nuts-os-stdout-is-not-buffered=) that os.Stdout are unbuffered in go, so I immediately tried to use a bufio.Writer, and everything was "fine". This code is obviously much more verbose, and it is of utmost importance to not forget flushing the buffer, since it apparently does not happen automatically on leaving main (and thus exiting):

print_buffer.go:

package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	var n [100000]byte

	bufStdout := bufio.NewWriter(os.Stdout)
	defer bufStdout.Flush()

	for _, v := range n {
		fmt.Fprintf(bufStdout, " 0x%02x", v)
	}
	fmt.Fprintf(bufStdout, "\n")
}

This results in:

real	0m0.046s
user	0m0.025s
sys	0m0.000s

The issue is that neither the documentation on fmt.Print*, nor the one on os.Stdout mention that I/O is unbuffered, and people using idiomatic C-style approaches to printing formatted output are in for a huge surprise. I certainly was dumbfounded as to why my simple go code performed so much worse than the equivalent C code.

What did you expect to see?

I expect the documentation to mention the buffering behaviour of I/O operations for functions in fmt, and for os.Stdin and os.Stdout. The documentation for fmt could also link to the os documentation, i.e. mention explicitly that the unbuffered os.Stdout is used.

What did you see instead?

Nothing.

@toothrot
Copy link
Contributor

@kohlrabi thanks for writing this up.

... I immediately tried to use a bufio.Writer, and everything was "fine". This code is obviously much more verbose, and it is of utmost importance to not forget flushing the buffer, since it apparently does not happen automatically on leaving main (and thus exiting):

The flush behavior is documented for bufio:

After all data has been written, the client should call the Flush method to guarantee all data has been forwarded to the underlying io.Writer.

I think this level of control is consistent with Go's API design of generally requiring the programmer to be explicit, and including tools like defer to make this easier to control.

I'm not sure if I agree on explicitly calling out that *os.File is unbuffered. Not all Go users are coming from C, and so may not have the expectation that fmt.Printf would be buffered.

/cc @ianlancetaylor

@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 21, 2020
@toothrot toothrot added this to the Backlog milestone Jan 21, 2020
@ianlancetaylor
Copy link
Contributor

I can't see how it makes sense to add any comment about buffering to the fmt package.

For the os package a short sentence or two somewhere might be OK.

@bitsofinfo
Copy link

bitsofinfo commented Oct 1, 2020

agreed wasted hours of time debugging on this...

@Nakilon
Copy link

Nakilon commented Nov 26, 2020

It does not look to work as unbuffered. I launch the Go program as a subprocess with popen from a program written in another language and see it puts \n randomly and even mixes the order. Wasted hours already. I do Println the JSON string and it just gets cut in half so when I read the line it's not full and JSON is invalid. More than that, if I do another Println it gets between two halves of the JSON that was just splitted with no reason.

@ianlancetaylor
Copy link
Contributor

@Nakilon Unbuffered output means that concurrent output from multiple goroutines can be interleaved.

@Nakilon
Copy link

Nakilon commented Nov 26, 2020

@ianlancetaylor ah, ok, I thought it means it gets flushed immediately preventing the mess.

But in my case I don't do anything concurrent, my program is pretty simple. It just accumulates JSON strings and then prints them out when asked. After this edit the issue appeared:

diff --git a/satellite.go b/satellite.go
index baec58d..479e769 100644
--- a/satellite.go
+++ b/satellite.go
@@ -12,7 +12,7 @@ func main() {
   var err error
   var line []byte
   var temp interface{}
-  var array []interface{}
+  var array [][]byte
   stdin := bufio.NewReader(os.Stdin)
   for {
     if c, err = stdin.ReadByte(); err != nil {
@@ -31,13 +31,10 @@ func main() {
       if err = json.Unmarshal(line, &temp); err != nil {
         panic(err)
       }
-      array = append(array, temp)
+      array = append(array, line)
     case 103:   // g print all and die
       fmt.Println(len(array));
-      for _, temp = range array {
-        if line, err = json.Marshal(temp); err != nil {
-          panic(err)
-        }
+      for _, line = range array {
         fmt.Println(string(line));
       }
       os.Exit(0)

Initially I was parsing the JSON strings (for other needs) and storing them in []interface{} but then I decided that I don't need to store them parsed and started store byte sequences in [][]byte right as they come from the stdin := bufio.NewReader(os.Stdin); stdin.ReadLine(). Now it inserts the \n instead of three other bytes (two commas and a digit) in one specific line at the same place whenever I run this. And if I add debug prints it even prints them in between where it put the corrupting \n.
Tried using os.Stdout.Write instead but the result is absolutely the same.
I wonder how to print to stdout in order.

UPD: other observations -- the corrupted line lacks leading two bytes, and it happens only when the array has more than one element.

@ianlancetaylor
Copy link
Contributor

@Nakilon That is best discussed on a forum, not on this issue. See https://golang.org/wiki/Questions. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants