os/exec: automatic encoding conversion for stdout/stderr on Windows #69709

Closed
lime2008 opened this issue Sep 30, 2024 · 7 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows

Comments

@lime2008

lime2008 commented Sep 30, 2024

Proposal: os/exec: Handle Windows Standard Streams Encoding

Currently, the os/exec package does not account for the fact that the encoding of standard output and standard error differs between Windows configurations. This can lead to garbled text when running commands that output non-ASCII characters on a Windows system where the "Use Unicode UTF-8 for worldwide language support" beta feature is not enabled.

Problem:

  • Windows traditionally uses locale-specific encodings (code pages) for console output, e.g., GBK on Simplified Chinese systems.
  • Windows introduced a beta feature that switches the system code page to UTF-8.
  • os/exec does not account for this difference when reading stdout, which can lead to garbled output when the beta feature is not enabled.

Proposed Solution:

Introduce a mechanism in os/exec to handle Windows console encoding variations, specifically:

  1. Detection: Automatically detect if the Windows UTF-8 beta feature is active.
  2. Transparent Handling: Based on the detected configuration, seamlessly handle encoding and decoding of output streams within os/exec.
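
The detection step could be sketched roughly as follows. This is only an illustration under assumptions: the names `codePageUTF8`, `codePageGBK`, and `needsConversion` are hypothetical and not part of os/exec, and the code page is passed in as a parameter. On Windows the value would presumably come from a Win32 call such as `GetACP`, which is left out here so the sketch stays platform-independent. With the beta UTF-8 setting enabled, the active code page is expected to be 65001.

```go
package main

import "fmt"

// Windows code page identifiers used in this sketch.
const (
	codePageUTF8 = 65001 // UTF-8
	codePageGBK  = 936   // GBK, Simplified Chinese
)

// needsConversion reports whether text produced under the given code page
// would have to be transcoded before it can be treated as UTF-8.
// Hypothetical helper; not part of os/exec.
func needsConversion(codePage uint32) bool {
	return codePage != codePageUTF8
}

func main() {
	for _, cp := range []uint32{codePageGBK, codePageUTF8} {
		fmt.Printf("code page %d: needs conversion = %v\n", cp, needsConversion(cp))
	}
}
```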

Benefits:

  • Code can be more consistent by removing the need for platform-specific encoding workarounds.
  • Improved compatibility with applications relying on UTF-8 output.
  • Enhanced user experience by avoiding garbled characters in console output.

This feature request aims to improve the reliability and user-friendliness of os/exec when interacting with console applications in diverse Windows environments. Thank you to everyone who reviews this request.

@gopherbot gopherbot added this to the Proposal milestone Sep 30, 2024
@qmuntal qmuntal removed the Proposal label Sep 30, 2024
@qmuntal qmuntal changed the title proposal: os/exec: Handle Windows standard streams Encoding os/exec: Handle Windows standard streams Encoding Sep 30, 2024
@qmuntal
Member

qmuntal commented Sep 30, 2024

Thanks for reporting this issue @lime2008. I'm moving this out of the proposal process given that you are suggesting that it can be fixed without adding new APIs. Let's treat this as a bug fix rather than a proposal for now.

Could you provide some more detailed steps for reproducing this issue on my PC?

@seankhliao seankhliao removed this from the Proposal milestone Sep 30, 2024
@qmuntal qmuntal added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 30, 2024
@lime2008
Author

Below is a script that demonstrates the problem:

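package main

import (
    "fmt"
    "log"
    "os/exec"
)

func main() {
    // Run cmd.exe and capture whatever bytes it writes to stdout.
    cmdRunner := exec.Command("cmd", "/C", "echo 中文字符测试 any chinese character test")
    output, err := cmdRunner.Output()
    if err != nil {
        log.Printf("Error executing command: %s", err)
        output = []byte(fmt.Sprintf("Error: %s", err))
    }
    // The raw bytes are printed without any decoding. EnsureUTF8 would be a
    // user-written conversion helper (not shown), so the call stays commented out.
    //utf8Result, err := EnsureUTF8(output)
    utf8Result := output
    fmt.Print(string(utf8Result))
}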

Running it produces output like this:
[screenshot]

@lime2008
Author

After enabling the setting (sorry, my system interface is in Chinese), which is under Settings -> Time & language -> Language & region:
[screenshot]
the program outputs the characters correctly:
[screenshot]
Thanks for the review.

@seankhliao seankhliao changed the title os/exec: Handle Windows standard streams Encoding os/exec: automatic encoding conversion for stdout/stderr on Windows Sep 30, 2024
@seankhliao
Member

I'm concerned that os/exec currently treats output as a stream of bytes, not necessarily text that has an encoding that needs to be translated. What happens when transparent handling tries to convert binary data?
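
To illustrate the concern: the bytes returned by os/exec carry no encoding label. The sketch below uses only the standard library and shows that the GBK encoding of 中文 is not valid UTF-8, and that a validity check alone cannot tell legacy-encoded text apart from binary data. `describe` is a hypothetical helper introduced for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe reports whether b is well-formed UTF-8.
// Hypothetical helper for illustration only.
func describe(b []byte) string {
	if utf8.Valid(b) {
		return "valid UTF-8"
	}
	return "not valid UTF-8"
}

func main() {
	// "中文" encoded as GBK.
	gbkText := []byte{0xD6, 0xD0, 0xCE, 0xC4}
	// The same text encoded as UTF-8 (Go source files are UTF-8).
	utf8Text := []byte("中文")
	// The first four bytes of a JPEG file, standing in for binary output.
	binary := []byte{0xFF, 0xD8, 0xFF, 0xE0}

	fmt.Println("GBK text:  ", describe(gbkText))
	fmt.Println("UTF-8 text:", describe(utf8Text))
	fmt.Println("binary:    ", describe(binary))
}
```

Both the GBK text and the binary data are reported as not valid UTF-8, so a transparent converter would have no reliable way to decide which of the two it should leave alone.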

@lime2008
Author

lime2008 commented Sep 30, 2024

How about adding an API to explicitly convert []byte to string based on the system character set? Or a parameter that decides whether to convert or not?
For comparison, Python has a mode r that decodes automatically using the system configuration, and rb for raw bytes.

@seankhliao
Member

Since most text encoding implementations are outside of the standard library, it seems any converter would be too.
You could convert the output with a decoder from https://pkg.go.dev/golang.org/x/text/encoding#Decoder.Bytes
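
A minimal sketch of what such opt-in conversion might look like in user code. The names `decodeFunc`, `toText`, and `latin1ToUTF8` are hypothetical. To keep the example free of external modules, a small ISO 8859-1 decoder stands in for a real one; a method value such as `simplifiedchinese.GBK.NewDecoder().Bytes` from golang.org/x/text has the same shape and could be passed instead.

```go
package main

import "fmt"

// decodeFunc converts bytes in some source encoding to UTF-8.
// The Bytes method of a golang.org/x/text encoding.Decoder has this shape.
type decodeFunc func([]byte) ([]byte, error)

// toText turns raw command output into a string. With a nil decoder the bytes
// are passed through untouched, so binary output is never altered by default.
// Hypothetical helper; not part of os/exec.
func toText(raw []byte, decode decodeFunc) (string, error) {
	if decode == nil {
		return string(raw), nil
	}
	out, err := decode(raw)
	if err != nil {
		return "", fmt.Errorf("decoding command output: %w", err)
	}
	return string(out), nil
}

// latin1ToUTF8 is a stand-in decoder: every ISO 8859-1 byte maps to the
// Unicode code point with the same numeric value.
func latin1ToUTF8(b []byte) ([]byte, error) {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return []byte(string(runes)), nil
}

func main() {
	// "café" encoded as ISO 8859-1; in practice this would be the result
	// of exec.Command(...).Output().
	raw := []byte{'c', 'a', 'f', 0xE9}
	text, err := toText(raw, latin1ToUTF8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(text)
}
```

Keeping the decoder as an explicit argument leaves the caller in charge of deciding whether the output is text at all, which sidesteps the binary-data concern raised above.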

@lime2008
Author

lime2008 commented Oct 8, 2024

I apologize for the delay in replying to you.
Thanks again for your time and assistance!

@lime2008 lime2008 closed this as completed Oct 8, 2024
4 participants