Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: CtrlZ doesn't work on Windows #17427

Closed
mattn opened this issue Oct 13, 2016 · 13 comments
Closed

os: CtrlZ doesn't work on Windows #17427

mattn opened this issue Oct 13, 2016 · 13 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Milestone

Comments

@mattn
Copy link
Member

mattn commented Oct 13, 2016

What version of Go are you using (go version)?

go version devel +1af769d Thu Oct 13 06:16:53 2016 +0000 windows/amd64

What operating system and processor architecture are you using (go env)?

Windows7 64bit

What did you do?

package main

import (
    "io"
    "log"
    "os"
)

func main() {
    _, err := io.Copy(os.Stdout, os.Stdin)
    if err != nil {
        log.Print(err)
    }
}

What did you expect to see?

break blocking of io.Copy.

What did you see instead?

doesn't break blocking of io.Copy.

I suggest to change like below

diff --git a/src/os/file_windows.go b/src/os/file_windows.go
index ed06b55..ad47ee3 100644
--- a/src/os/file_windows.go
+++ b/src/os/file_windows.go
@@ -233,7 +233,7 @@ func (f *File) readOneUTF16FromConsole() (uint16, error) {
            return 0, err
        }
        if nmb == 0 {
-           continue
+           return 0, nil
        }
        mbytes = append(mbytes, buf[0])

@@ -269,7 +269,7 @@ func (f *File) readConsole(buf []byte) (n int, err error) {
        return f.copyReadConsoleBuffer(buf)
    }
    wchar, err := f.readOneUTF16FromConsole()
-   if err != nil {
+   if err != nil || wchar == 0 {
        return 0, err
    }
    r := rune(wchar)

I wonder what way is best for go because below's C code work with different behavior to go.

#include <stdio.h>

int
main(int argc, char* argv[]) {
  char buf[256];
  printf("%d\n", fread(buf, 1, sizeof(buf), stdin));
  perror("fread");
  return 0;
}

C

Go

Microsoft C runtime seeks break fread when typing CTRL-Z in the beigin of the line. But my patch break even though typing CTRL-Z at end of the line.

I don't say that Go have to work as same as C. But I worry about the right way for go.

related issue #17097, #16857

@mikioh mikioh changed the title CtrlZ doesn't work on Windows os: CtrlZ doesn't work on Windows Oct 13, 2016
@alexbrainman
Copy link
Member

Sorry for breaking it. Need a test for it, so we don't break it again,

I will have a look into it when I have time. Or feel free to sen the change (with test), if you like. Thank you.

Alex

@mikioh mikioh added this to the Go1.8 milestone Oct 14, 2016
@gopherbot
Copy link

CL https://golang.org/cl/31114 mentions this issue.

@alexbrainman
Copy link
Member

I looked at CL 31114 and I am not sure new behavior (the way you convert Ctrl+Z into bytes/errors returned by os.Stdin,Read) in CL is correct. @rsc approved original CtrlZ change (CL 4310), so he, probably, knows what we want here.

I run your program above with this small change:

diff --git a/src/os/file_windows.go b/src/os/file_windows.go
index ed06b55..d94bd9b 100644
--- a/src/os/file_windows.go
+++ b/src/os/file_windows.go
@@ -229,6 +229,7 @@ func (f *File) readOneUTF16FromConsole() (uint16, error) {
    for {
        var nmb uint32
        err := readFile(f.fd, buf[:], &nmb, nil)
+       println("nmb=", nmb, " buf[0]=", buf[0], " err=", err)
        if err != nil {
            return 0, err
        }

Then I pressed:

"abc" and Enter
CtrlZ and Enter
"abc" and CtrlZ and "def" and Enter

I see this output:

C:\dev\src\issues\issue17427>go run main.go
abc
nmb= 1  buf[0]= 97  err= (0x0,0x0)
anmb= 1  buf[0]= 98  err= (0x0,0x0)
bnmb= 1  buf[0]= 99  err= (0x0,0x0)
cnmb= 1  buf[0]= 13  err= (0x0,0x0)
nmb= 1  buf[0]= 10  err= (0x0,0x0)

^Z
nmb= 0  buf[0]= 26  err= (0x0,0x0)
nmb= 1  buf[0]= 13  err= (0x0,0x0)
nmb= 1  buf[0]= 10  err= (0x0,0x0)

abc^Zdef
nmb= 1  buf[0]= 97  err= (0x0,0x0)
anmb= 1  buf[0]= 98  err= (0x0,0x0)
bnmb= 1  buf[0]= 99  err= (0x0,0x0)
cnmb= 0  buf[0]= 26  err= (0x0,0x0)
nmb= 1  buf[0]= 100  err= (0x0,0x0)
dnmb= 1  buf[0]= 101  err= (0x0,0x0)
enmb= 1  buf[0]= 102  err= (0x0,0x0)
fnmb= 1  buf[0]= 13  err= (0x0,0x0)
nmb= 1  buf[0]= 10  err= (0x0,0x0)

Russ, what whould new code for os.Stdin.Read return here? Assume we have changed the program above to loop forever instead of exiting after io.Copy returns. Should os.Stdin.Read return more data after it returns io.EOF? Should os.Stdin.Read return 26 when we press CtrlZ?

Thank you.

Alex

@rsc
Copy link
Contributor

rsc commented Oct 18, 2016

I propose that the first ^Z return 0, io.EOF and also set a bit so that all future reads return 0, io.EOF without reading any more data from the console. That way if you are at a Go program and type

abc^Zdef

then the program exits (one assumes because it saw EOF) with def still remaining in the console buffer for reading by the next program that checks.

@mattn
Copy link
Member Author

mattn commented Oct 18, 2016

Totally okay to me. And I propose that return 0, io.EOF for each typing ^Z.

io.Copy(os.Stdout, os.Stdin)
io.Copy(os.Stdout, os.Stdin)

This should be possible to read twice.

@rsc
Copy link
Contributor

rsc commented Oct 19, 2016

@mattn is right, and I was wrong: each ^Z should result in 0, io.EOF coming back, but then more reading should continue to read from the console. This matches what happens today on Unix with ^D on consoles

package main

import (
    "fmt"
    "os"
)

func main() {
    p := make([]byte, 100)
    fmt.Printf("> ")
    for {
        n, err := os.Stdin.Read(p)
        fmt.Printf("[%d %q %v]\n> ", n, p[:n], err)
    }
}
$ go run /tmp/x.go
> hello
[6 "hello\n" <nil>]
> ^D[0 "" EOF]
> hello ^D[6 "hello " <nil>]
> bye^D[3 "bye" <nil>]
> ^D[0 "" EOF]
> ^D[0 "" EOF]
> ^D[0 "" EOF]
> ^D[0 "" EOF]
> 

@alexbrainman
Copy link
Member

I am fine with whatever you decide here. @mattn, please, make appropriate changes to CL 31114 and will close this. Thank you.

Alex

@mattn
Copy link
Member Author

mattn commented Oct 21, 2016

I'll take a look. Currently, I found two problems.

  1. os.Stdin.Read doesn't fill buf.
  2. CTRL-Z doesn't work

The first is caused that readConsole return immediately if it finish the convertion of mbytes.

> hello
[1 "h" <nil>]
[1 "e" <nil>]
[1 "l" <nil>]
[1 "l" <nil>]
[1 "o" <nil>]
[1 "\r" <nil>]
[1 "\n" <nil>]

I'm guessing that we may need to call readFile with large bytes ex 4096 (BUFSIZ). Or another idea, readConsole handle \n for the break of input. Second way should check whether GetConsoleMode() return ENABLE_LINE_INPUT.

@quentinmit quentinmit added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 21, 2016
@mattn
Copy link
Member Author

mattn commented Oct 22, 2016

Current implementation seems not work for multi-byte string.

Passing [1]byte into ReadFile break the trailing byte of double byte character set. For example, Typing in cp932 (0x82 0xA0) and Enter (0x13, 0x10), This readFile return 0x82, 0x13, 0x10. Now I concluded that we can't fix this issue with current way to implement. However, I don't like to allocate memory to store utf-8 bytes which have same number of characters in DBCS. I want to ask you which way is best.

  • Allocate x3 bytes for utf-8, and call readFile(), convert with MultiByteToWideChar(cp)/WideCharToMultiByte(CP_UTF8).
  • Revert implementation of readFile and use ReadConsole. Then handle CTRL-Z especially.

Please give me your opinion.

@mattn
Copy link
Member Author

mattn commented Oct 22, 2016

2649868#diff-558a93088b8a6149f759cc86274f67d7R255

Implementation using ReadConsole allocate []int16 which have same number of characters. So I prefer the first way.

@mattn
Copy link
Member Author

mattn commented Oct 22, 2016

@alexbrainman I'll update CL 31114 and revert implementation last change. And I'll add bits trick to read 4 bytes which is minimum size of multi-byte. It will be bigger than len(b). But the readbuf will be read at next call of readConsole. The minimum size of multi-byte should be 4 because Windows can change current code page with chcp 65001 (CP_UTF8).

func (f *File) readConsole(b []byte) (n int, err error) {
    if len(b) == 0 {
        return 0, nil
    }
    if len(f.readbuf) == 0 {
        numBytes := len(b)
        // Windows  can't read bytes over max of int16.
        // Some versions of Windows can read even less.
        // See golang.org/issue/13697.
        if numBytes > 10000 {
            numBytes = 10000
        } else if numBytes < 4 {
            // Read minimum size of multi-byte 
            numBytes = 4
        }
        mbytes := make([]byte, numBytes)
        var nmb uint32
        err := syscall.ReadFile(f.fd, mbytes, &nmb, nil)
        if err != nil {
            return 0, err
        }
        if nmb > 0 {
            var pmb *byte
            if len(b) > 0 {
                pmb = &mbytes[0]
            }
            ccp := windows.GetConsoleCP()
            // Convert from 8-bit console encoding to UTF16.
            // MultiByteToWideChar defaults to Unicode NFC form, which is the expected one.
            nwc, err := windows.MultiByteToWideChar(ccp, 0, pmb, int32(nmb), nil, 0)
            if err != nil {
                return 0, err
            }
            wchars := make([]uint16, nwc)
            pwc := &wchars[0]
            nwc, err = windows.MultiByteToWideChar(ccp, 0, pmb, int32(nmb), pwc, nwc)
            if err != nil {
                return 0, err
            }
            f.readbuf = utf16.Decode(wchars[:nwc])
        }
    }
    for i, r := range f.readbuf {
        if utf8.RuneLen(r) > len(b) {
            f.readbuf = f.readbuf[i:]
            return n, nil
        }
        nr := utf8.EncodeRune(b, r)
        b = b[nr:]
        n += nr
    }
    f.readbuf = nil
    return n, nil
}

@mattn
Copy link
Member Author

mattn commented Oct 22, 2016

I found a bug of ReadFile on console. See following code.

#include <windows.h>
#include <stdio.h>

int
main(int argc, char* argv[]) {
  HANDLE h = GetStdHandle(STD_INPUT_HANDLE);
  char buf[7];
  DWORD nread;
  BOOL result;

  result = ReadFile(h, buf, sizeof(buf)-1, &nread, NULL);
  printf("result=%d, nread=%d\n", result, nread);
  buf[nread] = 0;
  puts(buf);

  result = ReadFile(h, buf, sizeof(buf)-1, &nread, NULL);
  printf("result=%d, nread=%d\n", result, nread);
  buf[nread] = 0;
  puts(buf);

  return 0;
}

When typing あいうえ in cp932 (8bytes), second call of ReadFile should read remaining (2 bytes). But nothing to read. This doesn't occur when input ASCII bytes only. So I updated CL 31114 that read bytes with enough bytes.

@gopherbot
Copy link

CL https://golang.org/cl/33451 mentions this issue.

@golang golang locked and limited conversation to collaborators Nov 29, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Projects
None yet
Development

No branches or pull requests

6 participants