Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bufio: race condition when misusing ReadSlice #4977

Closed
gopherbot opened this issue Mar 4, 2013 · 8 comments
Closed

bufio: race condition when misusing ReadSlice #4977

gopherbot opened this issue Mar 4, 2013 · 8 comments

Comments

@gopherbot
Copy link

by striginsv@yandex.ru:

There is a problem using []byte channel when processing large amount of data.
$ wc -l bigfile.txt
3601800 bigfile.txt

package main

import (
    "bufio"
    "fmt"
    "os"
    "runtime"
)

var ch chan []byte

func reader(fname string, ch chan []byte) {
    f, err := os.Open(fname)
    if err != nil {
        return
    }
    defer f.Close()
    r := bufio.NewReader(f)

    for {
        s, ok := r.ReadSlice('\n')
        if ok != nil {
            close(ch)
            break
        }
        ch <- (s)
    }
}

func main() {
    runtime.GOMAXPROCS(runtime.NumCPU())

    ch := make(chan []byte)

    if len(os.Args) != 2 {
        fmt.Printf("Usage: %s <filename>\n", os.Args[0])
        return
    }
    fname := os.Args[1]

    go reader(fname, ch)
    // num := 0
    for s := range ch {
        _ = s
        // num++
        fmt.Print(string(s))
    }
    // fmt.Println("Lines number :", num)
}

$ go run counter.go | wc -l
3601787
But if we uncomment counter, the output will be right:
Lines number : 3601800

The problem disappears if we change channel type to string, or disable GOMAXPROCS.
@remyoudompheng
Copy link
Contributor

Comment 1:

Your program has a race condition: it is forbidden to read from s (in fmt.Print) while
it is written to. The call to ReadSlice will write to s, giving incorrect results.
This explains your problem.

Labels changed: removed priority-triage, go1.1maybe.

Status changed to WorkingAsIntended.

@gopherbot
Copy link
Author

Comment 2 by striginsv@yandex.ru:

Thank you for replying!
But i'm doing it through channel, which should block simultaneous read/write, as i
understand.

@rsc
Copy link
Contributor

rsc commented Mar 4, 2013

Comment 3:

After the first send completes, you have one goroutine doing string(buf)
and another goroutine doing a second call to ReadSlice at the same time.
Russ

@gopherbot
Copy link
Author

Comment 4 by striginsv@yandex.ru:

I understand my mistake) When we use []byte, we send pointer to our slice on channel.
Thank you.

@dvyukov
Copy link
Member

dvyukov commented Mar 5, 2013

Comment 5:

If you build the program with -race flag on tip (go build -race), you will see the
following data race report:
WARNING: DATA RACE
Read by goroutine 1:
  runtime.slicebytetostring()
      src/pkg/runtime/zstring_darwin_amd64.c:289 +0x0
  main.main()
      /tmp/race.go:46 +0x339
  runtime.main()
      src/pkg/runtime/proc.c:180 +0xa0
Previous write by goroutine 4:
  runtime.copy()
      src/pkg/runtime/slice.c:237 +0x0
  bufio.(*Reader).fill()
      src/pkg/bufio/bufio.go:73 +0x13b
  bufio.(*Reader).ReadSlice()
      src/pkg/bufio/bufio.go:262 +0x53c
  main.reader()
      /tmp/race.go:21 +0xcf
  gosched0()
      src/pkg/runtime/proc.c:1153 +0x9f
Goroutine 1 (running) created at:
  _rt0_amd64()
      src/pkg/runtime/asm_amd64.s:74 +0xdb
Goroutine 4 (running) created at:
  main.main()
      /tmp/race.go:41 +0x201
  runtime.main()
      src/pkg/runtime/proc.c:180 +0xa0

@dvyukov
Copy link
Member

dvyukov commented Mar 5, 2013

Comment 6:

The GC collects ~1000 objects, but they seem to be small. I think the
GC happens after allocation of a huge object (few GBs), soon after the
previous GC.
Looking at these records:
gc37(8): 3+0+0 ms, 6523 -> 6523 MB 9490 -> 9489 (68285-58796) objects,
65(2826) handoff, 38(541) steal, 667/411/122 yields
gc38(8): 2+0+0 ms, 14715 -> 14715 MB 9490 -> 9490 (68287-58797)
objects, 100(5268) handoff, 39(743) steal, 809/426/86 yields
it seems that the GC happens after allocation of 8192MB object. Can it
be exactly the bytes.Buffer that we've seen in the stack trace?

@dvyukov
Copy link
Member

dvyukov commented Mar 5, 2013

Comment 7:

Sorry, wrong issue.

@dvyukov
Copy link
Member

dvyukov commented Mar 11, 2013

Comment 8:

Labels changed: added threadsanitizer.

@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants