Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

log: allows special characters to be logged #50277

Open
divVerent opened this issue Dec 20, 2021 · 15 comments
Open

log: allows special characters to be logged #50277

divVerent opened this issue Dec 20, 2021 · 15 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@divVerent
Copy link

divVerent commented Dec 20, 2021

What version of Go are you using (go version)?

$ go version
go version go1.17.3 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/rpolzer/.cache/go-build"
GOENV="/home/rpolzer/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/rpolzer/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/rpolzer/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/rpolzer/homedir/src/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/rpolzer/homedir/src/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17.3"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1757116074=/tmp/go-build -gno-record-gcc-switches"

What did you do?

https://go.dev/play/p/RwvI7bW06_J

What did you expect to see?

2009/11/10 23:00:00 Hello, world
                    Go away, world
2009/11/10 23:00:00 Hello, \b\b\b\b\b\b\bGo away, world
2009/11/10 23:00:00 Hello, world\033[G2022

What did you see instead?

In go.dev/play:

2009/11/10 23:00:00 Hello, world
2009/11/10 23:00:00 Go away, world
2009/11/10 23:00:00 Hello, Go away, world
2009/11/10 23:00:00 Hello, world[G2022

This is indistinguishable from "Go away, world" being a real log message. Can be used by an attacker to e.g. make it look like some other user did something bad, getting them banned by a manual admin or some automated log analysis tool (e.g. fail2ban).

On a terminal when printed to stderr:

2021/12/20 07:56:55 Hello, world
2009/11/10 23:00:00 Go away, world
2021/12/20 07:56:55 Go away, world
2022/12/20 07:56:55 Hello, world

On a terminal in "less":

2021/12/20 08:00:15 Hello, world
2009/11/10 23:00:00 Go away, world
2021/12/20 08:00:15 Go away, world
2021/12/20 08:00:15 Hello, worldESC[G2022

In addition, this let me change/hide an existing log message, and change other data in the log message.

This is clearly all rather minor to e.g. what log4j had ;) but still may be serious in some applications. I suggest hardening the "log" package against this by escaping/replacing special characters in log messages; not sure about best handling of newlines (the indent approach above would be great, but I guess any "unique" formatting is fine there).

More of the same: similar "exploits" may be doable by including Unicode LTR/RTL overrides in the log message; this too should ideally be hardened against.

@divVerent divVerent changed the title affected/package: log "log" allows special characters to be logged Dec 20, 2021
@zx2c4
Copy link
Contributor

zx2c4 commented Dec 20, 2021

Does %q not solve this?

https://go.dev/play/p/i97Z2XmpmcE

@divVerent
Copy link
Author

It absolutely does. So part of a solution could be making REALLY CLEAR to EVERY user of the package that %v and %s should only ever be used for strings that cannot contain user input.

Also, %q cannot be used safely on structs: https://go.dev/play/p/FLoC2-CJSwk, so %#v seems to be the only safe way to log structs with user input.

But yes, I kinda am open to a documentation-only solution to this issue.

@zx2c4
Copy link
Contributor

zx2c4 commented Dec 20, 2021

I suppose what you're really after might be a vis(3)-like thing that's built into a special logger function? (log.SafePrintf("%v", attackerControlled)) Or maybe even a modifier to % that does the right thing, without necessarily adding quotes like %q does? (log.Printf("%^v", attackerControlled))

Of these, I like the second idea better.

But there's also an argument to be made that %q is the only safe way to do this anyway, since otherwise the space character becomes an interpretive hazard. For example https://go.dev/play/p/czlaVxlHTwn

So this becomes a kind of hard thing to specify generally, which I guess is why vis(3) has 8 flags for it and why Go just has %q.

@divVerent
Copy link
Author

divVerent commented Dec 20, 2021

I think we have two problems there.

Problem 1: User input can look like e.g. spaces around log messages. A possible solution could be creating something "between" %#v and %+v" that is safe for structs containing user input and virtually anything else (but prints less "noise" - think of a "%+v that uses %q for strings") and till then to recommend that anyone who logs user input shall use %q, %#v or explicit numeric formats, but never %v or %s. I like your idea of a %^v format for this purpose; sadly it is likely not possible to just make %+v quote strings, as that is likely to break existing code.

Problem 2: Multi-line log messages are never "safe". This could be solved as suggested above - or we could explicitly not solve it but explain in documentation that logging of strings containing newlines is strongly discouraged because the output format becomes ambiguous. In addition, we could add as new Logger flag to "harden" the logger against this - basically, a flag Lhardened that makes Logger.Output sanitize what it prints to the writer. Why I believe this is needed even if Problem 1 is solved is that existing code may already be vulnerable, and it is hard to audit code for this issue, so it would be nice to have some defense in depth at runtime at least against the worst of the worst.

As a hack, one could of course instead provide a hardened Writer by assuming that Logger.Output calls Write precisely once. Not sure if such a guarantee should be made, though.

@ianlancetaylor ianlancetaylor changed the title "log" allows special characters to be logged log: allows special characters to be logged Dec 20, 2021
@ianlancetaylor
Copy link
Contributor

The log package already defines that each logging operation makes exactly one call to the Write method. (This is more or less required to provide good support for simultaneous logging from multiple goroutines.)

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 21, 2021
@adonovan
Copy link
Member

This is indistinguishable from "Go away, world" being a real log message. Can be used by an attacker ...

I would argue that the log package is the wrong place to solve this purported security vulnerability. The stderr output of a UNIX process is a stream of bytes, some of which are human-readable lines of text that are vaguely useful to a maintainer. That's the extent of the contract. Any dependency package with which your application logic shares stderr is liable to emit burps of unpredictable data. The real security problem is in a tool that assumes the log is structured. (This includes any log viewer that simply serves raw log files in a form that allows a client to interpret them as HTML or JavaScript.)

This category of security problems is recognized by https://cwe.mitre.org/data/definitions/117.html, but I'm skeptical that changing the expectations around stderr is a practical mitigation. And it is frustrating when overzealous static checkers obstruct the build because of a temporary log.Print(req.Form.whatever) statement in an HTTP handler.

@divVerent
Copy link
Author

divVerent commented Dec 27, 2021 via email

@adonovan
Copy link
Member

adonovan commented Dec 27, 2021

But logs ARE structured. They are an array of lines, each being a timestamp and a string.
And exactly this structure should be encoded clearly without exceptions. This is what I am asking for.
My general expectation of a logging system is that it is its responsibility that log messages are
associated with timestamp and separated cleanly no matter what the input.
Right now Go's log package violates this when newlines are in the string to be logged.

I don't think the maintainers of Go's log package share your expectation. The log package does not provide structured logging: its formatting is nothing more than a convenience. If you want structured logging---and that's a perfectly reasonable thing to want---you'll need to use a different package, one that quotes every argument value, and writes log entries to a different sink than stderr, which is fundamentally unstructured. There are many third-party Go logging packages, structured and unstructured; see https://www.client9.com/logging-packages-in-golang/ for examples.

@divVerent
Copy link
Author

divVerent commented Dec 27, 2021 via email

@adonovan
Copy link
Member

adonovan commented Dec 27, 2021

Another option would be stripping or escaping newlines in output, but it
may not be possible to square that with the compatibility promise for 1.x.

Indeed, we can't change the existing behavior. Many tests rely on the precise formatting of error messages.

Yet another option would be just providing a way to do so (e.g. an
io.Writer wrapper that encapsulates write calls to escape inner newlines ...

By the time of the Write call, all the argument structure has been lost, so it's too late to do quoting then.

If it is the case that the log package is not expected to provide safe
logging provided "bad input", it would make sense to document this fact.

The log package's documentation is quite clear about what it does (timestamp, printf, newline, stderr), and I don't see how a reader could conclude from it that the output is structured or unambiguous. But perhaps you can think of a helpful sentence that makes things clearer, without giving credit to the misconception that stderr is anything other than fundamentally unstructured.

@divVerent
Copy link
Author

First of all, it is true that stderr can't be very structured, but stderr is only one possible destination of the log package.

By the time of the write call, indeed internal structure of the log line is lost, however the structure of the log itself (array of timestamped lines) is still there.

As for why a reader could conclude that the output is unambiguous as per the log package's own formatting (again, not WITHIN a line of log but ACROSS), simply because other logging systems, e.g. syslog, do ensure that, while also providing no structure guarantees within each log line. If my C program contains syslog(LOG_INFO, "%s just logged in", username), then there is no way to inject a fake timestamp or second log event into the syslog with that - but clearly a malicious user could set their username so this will log "I rule the world, and root just logged in". I would expect "log" to provide the same guarantees as syslog provided the output is not stderr (or provided I ensure that no code outside the log package writes to stderr, which is usually a given - e.g. I am not aware of any Go package that does so without going via the "log" package, which also ensures serialization of calls for this specific reason).

As logs are text files, having escape sequences in them could be similarly dangerous, so maybe a good recommendation would be

Printf calls Output to print to the standard logger. Arguments are handled in the manner of fmt.Printf.

As logging output tends to be written to terminals or text files, it is recommended to only ever log untrusted (including user-provided) textual data with appropriate quoting, as ensured by the %q or %#v format specifiers.

Similarly, log.Print and log.Println should be entirely discouraged on untrusted textual data.

@adonovan
Copy link
Member

... provided I ensure that no code outside the log package writes to stderr, which is usually a given - e.g. I am not aware of any Go package that does so without going via the "log" package ...

Many Go packages write directly to stderr, sometimes just because debug log wasn't removed.

As for why a reader could conclude that the output is unambiguous as per the log package's own formatting (again, not WITHIN a line of log but ACROSS), simply because other logging systems, e.g. syslog, do ensure that, while also providing no structure guarantees within each log line. If my C program contains syslog(LOG_INFO, "%s just logged in", username), then there is no way to inject a fake timestamp or second log event into the syslog...

If all you need is to ensure that the boundaries of each log entry are unambiguous, the current API affords you the means to do that. Call log.Default().SetOutput, passing an io.Writer that removes interior newlines, perhaps replacing them with the two-byte sequence \n, before passing the result to a call to out.Write, where out is a file other than stderr.

@divVerent
Copy link
Author

divVerent commented Dec 29, 2021 via email

@seankhliao
Copy link
Member

since you bring up syslog, the standard library does have log/syslog

As I see it, the log package makes no claim of safety or that output will all be on one line, only that each call starts on a new line. (I for one would be very annoyed of it started escaping things like newlines automatically). In my experience, if you wanted generic structured output, you'd likely want json or logfmt to make parsing easier, at which point it is out of scope for log.

@divVerent
Copy link
Author

divVerent commented Dec 30, 2021 via email

@seankhliao seankhliao added this to the Unplanned milestone Aug 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants