New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
log: allows special characters to be logged #50277
Comments
Does |
It absolutely does. So part of a solution could be making REALLY CLEAR to EVERY user of the package that %v and %s should only ever be used for strings that cannot contain user input. Also, %q cannot be used safely on structs: https://go.dev/play/p/FLoC2-CJSwk, so %#v seems to be the only safe way to log structs with user input. But yes, I kinda am open to a documentation-only solution to this issue. |
I suppose what you're really after might be a Of these, I like the second idea better. But there's also an argument to be made that So this becomes a kind of hard thing to specify generally, which I guess is why |
I think we have two problems there. Problem 1: User input can look like e.g. spaces around log messages. A possible solution could be creating something "between" %#v and %+v" that is safe for structs containing user input and virtually anything else (but prints less "noise" - think of a "%+v that uses %q for strings") and till then to recommend that anyone who logs user input shall use %q, %#v or explicit numeric formats, but never %v or %s. I like your idea of a %^v format for this purpose; sadly it is likely not possible to just make %+v quote strings, as that is likely to break existing code. Problem 2: Multi-line log messages are never "safe". This could be solved as suggested above - or we could explicitly not solve it but explain in documentation that logging of strings containing newlines is strongly discouraged because the output format becomes ambiguous. In addition, we could add as new Logger flag to "harden" the logger against this - basically, a flag Lhardened that makes Logger.Output sanitize what it prints to the writer. Why I believe this is needed even if Problem 1 is solved is that existing code may already be vulnerable, and it is hard to audit code for this issue, so it would be nice to have some defense in depth at runtime at least against the worst of the worst. As a hack, one could of course instead provide a hardened Writer by assuming that Logger.Output calls Write precisely once. Not sure if such a guarantee should be made, though. |
The log package already defines that each logging operation makes exactly one call to the |
I would argue that the log package is the wrong place to solve this purported security vulnerability. The stderr output of a UNIX process is a stream of bytes, some of which are human-readable lines of text that are vaguely useful to a maintainer. That's the extent of the contract. Any dependency package with which your application logic shares stderr is liable to emit burps of unpredictable data. The real security problem is in a tool that assumes the log is structured. (This includes any log viewer that simply serves raw log files in a form that allows a client to interpret them as HTML or JavaScript.) This category of security problems is recognized by https://cwe.mitre.org/data/definitions/117.html, but I'm skeptical that changing the expectations around stderr is a practical mitigation. And it is frustrating when overzealous static checkers obstruct the build because of a temporary log.Print(req.Form.whatever) statement in an HTTP handler. |
On Mon, Dec 27, 2021, 11:45 Alan Donovan ***@***.***> wrote:
This is indistinguishable from "Go away, world" being a real log message.
Can be used by an attacker ...
I would argue that the log package is the wrong place to solve this
purported security vulnerability. The stderr output of a UNIX process is a
stream of bytes, some of which are human-readable lines of text that are
vaguely useful to a maintainer. That's the extent of the contract. Any
dependency package with which your application logic shares stderr is
liable to emit burps of unpredictable data. The real security problem is in
a tool that assumes the log is structured. (This includes any log viewer
that simply serves raw log files in a form that allows a client to
interpret them as HTML or JavaScript.)
But logs ARE structured. They are an array of lines, each being a timestamp
and a string.
And exactly this structure should be encoded clearly without exceptions.
This is what I am asking for.
As for the ANSI and backspace stuff, I indeed see that as less serious.
After all, it is trivial to create a log viewer not vulnerable to that. In
fact, "cat -e | less" might already be one. The newline injection is what
really worries me.
This category of security problems is recognized by
https://cwe.mitre.org/data/definitions/117.html, but I'm skeptical that
changing the expectations around stderr is a practical mitigation. And it
is frustrating when overzealous static checkers obstruct the build because
of a temporary log.Print(req.Form.whatever) statement in an HTTP handler.
I would agree with you if you were talking about fmt.Println, but
log.Printf is more - it e.g. prepends the date and ensures a newline at the
end.
My general expectation of a logging system is that it is its responsibility
that log messages are associated with timestamp and separated cleanly no
matter what the input. Right now Go's log package violates this when
newlines are in the string to be logged.
And why do I expect that? Well, syslog has the same expectation. I see
log.Printf as competing with syslog, not as competing with
fmt.Sprintf(os.Stderr).
As for stderr itself, this is orthogonal - log.SetOutput aready allows the
user to easily direct log messages to a file. The primary added value of
the log package should be the timestamp and the line separator.
—
… Reply to this email directly, view it on GitHub
<#50277 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAB5NMGHSJE7VZB5NXSYMP3UTCJ2VANCNFSM5KOBOUDA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
I don't think the maintainers of Go's log package share your expectation. The log package does not provide structured logging: its formatting is nothing more than a convenience. If you want structured logging---and that's a perfectly reasonable thing to want---you'll need to use a different package, one that quotes every argument value, and writes log entries to a different sink than stderr, which is fundamentally unstructured. There are many third-party Go logging packages, structured and unstructured; see https://www.client9.com/logging-packages-in-golang/ for examples. |
If it is the case that the log package is no expected to provide safe
logging provided "bad input", it would make sense to document this fact.
As it looks now, the package _looks_ like it provides some guaranteed
structure (as said, timestamps and newlines), so if this is not the case,
this should be stated explicitly.
I should say that I would solving this issue by adding documentation is a
perfectly valid option. In particular this could come in form of
recommending to only ever use numeric specifiers, %q and %#v to log user
input (those three guarantee that no bad characters are in the output).
Another option would be stripping or escaping newlines in output, but it
may not be possible to square that with the compatibility promise for 1.x.
Yet another option would be just providing a way to do so (e.g. an
io.Writer wrapper that encapsulates write calls to escape inner newlines,
kinda emulating a datagram socket, see syslog), and recommending users to
use this output wrapper if they so desire.
…On Mon, Dec 27, 2021, 15:28 Alan Donovan ***@***.***> wrote:
But logs ARE structured. They are an array of lines, each being a
timestamp and a string.
And exactly this structure should be encoded clearly without exceptions.
This is what I am asking for.
My general expectation of a logging system is that it is its
responsibility that log messages are
associated with timestamp and separated cleanly no matter what the input.
Right now Go's log package violates this when newlines are in the string
to be logged.
I don't think the maintainers of Go's log package share your expectation.
The log package does not provide structured logging: its formatting is
nothing more than a convenience. If you want structured logging---and
that's a perfectly reasonable thing to want---you'll need to use a
different package, one that quotes every argument value, and writes log
entries to a different sink than stderr, which is fundamentally
unstructured. There are many third-party logging packages, structured and
unstructured; see https://www.client9.com/logging-packages-in-golang/ for
examples.
—
Reply to this email directly, view it on GitHub
<#50277 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAB5NMBGSND7RVERMEJ7STTUTDD7RANCNFSM5KOBOUDA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
Indeed, we can't change the existing behavior. Many tests rely on the precise formatting of error messages.
By the time of the Write call, all the argument structure has been lost, so it's too late to do quoting then.
The log package's documentation is quite clear about what it does (timestamp, printf, newline, stderr), and I don't see how a reader could conclude from it that the output is structured or unambiguous. But perhaps you can think of a helpful sentence that makes things clearer, without giving credit to the misconception that stderr is anything other than fundamentally unstructured. |
First of all, it is true that stderr can't be very structured, but stderr is only one possible destination of the log package. By the time of the write call, indeed internal structure of the log line is lost, however the structure of the log itself (array of timestamped lines) is still there. As for why a reader could conclude that the output is unambiguous as per the log package's own formatting (again, not WITHIN a line of log but ACROSS), simply because other logging systems, e.g. syslog, do ensure that, while also providing no structure guarantees within each log line. If my C program contains As logs are text files, having escape sequences in them could be similarly dangerous, so maybe a good recommendation would be
Similarly, log.Print and log.Println should be entirely discouraged on untrusted textual data. |
Many Go packages write directly to stderr, sometimes just because debug log wasn't removed.
If all you need is to ensure that the boundaries of each log entry are unambiguous, the current API affords you the means to do that. Call |
On Wed, Dec 29, 2021, 17:44 Alan Donovan ***@***.***> wrote:
... provided I ensure that no code outside the log package writes to
stderr, which is usually a given - e.g. I am not aware of any Go package
that does so without going via the "log" package ...
Many Go packages write directly to stderr, sometimes just because debug
log wasn't removed.
Yeah, maybe; the only case I know about is a Go package that uses a C
library. The C library then does that. For this reason I am already using
SetOutput to direct log to a file.
As for why a reader could conclude that the output is unambiguous as per
the log package's own formatting (again, not WITHIN a line of log but
ACROSS), simply because other logging systems, e.g. syslog, do ensure that,
while also providing no structure guarantees within each log line. If my C
program contains syslog(LOG_INFO, "%s just logged in", username), then
there is no way to inject a fake timestamp or second log event into the
syslog...
If all you need is to ensure that the boundaries of each log entry are
unambiguous, the current API affords you the means to do that. Call
log.Default().SetOutput, passing an io.Writer that removes interior
newlines, perhaps replacing them with the two-byte sequence \n, before
passing the result to a call to out.Write, where out is a file other than
stderr.
Yeah, and this is one of the possible solutions I am suggesting - providing
an io.Writer wrapper that does just that.
But as this would be an rather oddball wrapper, not sure about that. A
documentation only solution is fine by me too.
—
… Reply to this email directly, view it on GitHub
<#50277 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAB5NMFE6X2QL3EVM6JCHELUTOFOHANCNFSM5KOBOUDA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
since you bring up syslog, the standard library does have As I see it, the |
On Wed, Dec 29, 2021, 18:05 Sean Liao ***@***.***> wrote:
since you bring up syslog, the standard library does have log/syslog
<https://pkg.go.dev/log/syslog>
As I see it, the log package makes no claim of safety or that output will
all be on one line, only that each call starts on a new line. (I for one
would be very annoyed of it started escaping things like newlines
automatically). In my experience, if you wanted generic structured output,
you'd likely want json or logfmt to make parsing easier, at which point it
is out of scope for log.
Makes sense - however as the package is called "log", it seems reasonable
to expect some kind of structure in the output comparable to what syslog
provides (i.e. not a lot, but certainly some minor integrity guarantees).
If this is out of scope for a package called "log" (and not
"stderrprinter"), it would be a good idea to state this explicitly, and not
just implicitly by omission.
One idea, as above, could be giving the practical recommendation about %#v
and %q (as well as using types that perform appropriate escaping in their
String() method, such as protocol buffers). Another idea could even be
outright stating that the log package is not to be used on untrusted data
including any form of user input without sanitizing it first.
—
… Reply to this email directly, view it on GitHub
<#50277 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAB5NMC6ZSHWB4P4SDXAS3TUTOH4XANCNFSM5KOBOUDA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
https://go.dev/play/p/RwvI7bW06_J
What did you expect to see?
What did you see instead?
In go.dev/play:
This is indistinguishable from "Go away, world" being a real log message. Can be used by an attacker to e.g. make it look like some other user did something bad, getting them banned by a manual admin or some automated log analysis tool (e.g. fail2ban).
On a terminal when printed to stderr:
On a terminal in "less":
In addition, this let me change/hide an existing log message, and change other data in the log message.
This is clearly all rather minor to e.g. what log4j had ;) but still may be serious in some applications. I suggest hardening the "log" package against this by escaping/replacing special characters in log messages; not sure about best handling of newlines (the indent approach above would be great, but I guess any "unique" formatting is fine there).
More of the same: similar "exploits" may be doable by including Unicode LTR/RTL overrides in the log message; this too should ideally be hardened against.
The text was updated successfully, but these errors were encountered: