Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

log/slog: one allocation in TextHandler with ReplaceAttr #61774

Closed
vikstrous opened this issue Aug 5, 2023 · 8 comments
Closed

log/slog: one allocation in TextHandler with ReplaceAttr #61774

vikstrous opened this issue Aug 5, 2023 · 8 comments
Assignees
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@vikstrous
Copy link

vikstrous commented Aug 5, 2023

Disclaimer: I was messing around with slog mostly for my own amusement and none of this comes from an actual need.

What version of Go are you using (go version)?

$ go version go1.21rc4 linux/amd64

Does this issue reproduce with the latest release?

yes, with go1.21rc4

What operating system and processor architecture are you using (go env)?

not relevant

What did you do?

When using TextHandler with slog and setting ReplaceAttr, there's one allocation per log rather than zero.
Note that with JSONHandler, there are 4 allocations reported by this test and I haven't tracked them down.

package alloc_test

import (
	"io"
	"log/slog"
	"testing"
)

func TestAlloc(t *testing.T) {
	pureLogger := slog.New(slog.NewTextHandler(io.Discard, nil))
	replaceLogger := slog.New(slog.NewTextHandler(io.Discard, &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr { return a },
	}))
	pureAllocs := testing.AllocsPerRun(1, func() {
		pureLogger.Info("hello")
	})
	replaceAllocs := testing.AllocsPerRun(1, func() {
		replaceLogger.Info("hello")
	})
	if pureAllocs != replaceAllocs {
		t.Fatalf("allocs without ReplaceAttr %.0f != allocs with ReplaceAttr %.0f", pureAllocs, replaceAllocs)
	}
}

What did you expect to see?

Test passing

What did you see instead?

allocs without ReplaceAttr 0 != allocs with ReplaceAttr 1

More context

The difference is that without ReplaceAttr, Level is directly converted to string and with it, it's kept as the Level type so that ReplaceAttr can act on the Level type.

When it comes time to print Level's value, its implementation of MarshalText returns a byte slice, but the conversion from string to byte slice causes an allocation. This is basically a limitation of the TextMarshaler interface. Not that everything we are seeing here applies to any marshaling of slog.Any wrapped values.

I don't think there's a more efficient way to implement MarshalText for Level. I considered two bad alternatives:

  • Level.MarshalText can use the unsafe package to create a byte slice from a string without the allocation. The weird thing about this is that it exposes the memory in such a way that if the caller of MarshalText assigns to any of the returned bytes, it causes a segfault. It would be weird for the caller to do that, but risking a segfault is not something we'd want to expose anyone to.
  • Level.MarshalText can return pre-allocated global slices for the standard levels. This is also less safe that what happens now, but instead of a segfault, assigning to the returned slice results in log levels being printed incorrectly for future logs.

Proposed solution

Prioritizing Stringer over TextMarshaler would solve this issue. We can put this:

		if st, ok := v.Any().(fmt.Stringer); ok {
			s.appendString(st.String())
			return nil
		}

right here above the TextMarshaler check.

Right now String() is effectively what's called if TextMarshaler is not implemented, but it's called indirectly through fmt.Sprintf("%+v", v.Any()), which causes an allocation. By putting the stringer option first and doing it explicitly, we give all "Any" values an opportunity to marshal themselves efficiently as strings.

A similar trick can be done in appendJSONMarshal for the json case to remove all 4 allocations. I think it wouldn't be completely crazy for slog to prefer to format values as strings regardless of the encoding of the logs. It can be argued that the encoding of the logs shouldn't affect the content of the logs and the preference for using Stringer would make the two implementations more consistent with each other.

@vikstrous vikstrous changed the title affected/package: log/slog - one allocation in TestHandler with ReplaceAttr log/slog: one allocation in TestHandler with ReplaceAttr Aug 5, 2023
@mknyszek
Copy link
Contributor

mknyszek commented Aug 7, 2023

CC @jba

@mknyszek mknyszek added Performance NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 7, 2023
@mknyszek mknyszek added this to the Backlog milestone Aug 7, 2023
@jba jba self-assigned this Aug 7, 2023
@bcmills bcmills changed the title log/slog: one allocation in TestHandler with ReplaceAttr log/slog: one allocation in TextHandler with ReplaceAttr Aug 7, 2023
@vikstrous
Copy link
Author

Arguing against my proposed solution, I can see two issues:

  • TextMarshaler has a corresponding TextUnmarshaler, so the encoding is expected to be reversible. Stringer is not expected to be reversible. Reversibility seems desirable for encoding of values in logs.
  • The only time an allocation can be avoided is when the string has been pre-computed somehow. The best way to pre-computed the string is to not use slog.Any in the first place. The caller can just use slog.String instead with a pre-formatted string.

I'm tempted to close this issue... But maybe something can be done for Level in particular. I wonder if the custom ReplaceAttr function can replace the attr.Any with a pre-allocated string and avoid the allocation that way. I'll try that and report the results.

@vikstrous
Copy link
Author

vikstrous commented Aug 8, 2023

Yes! Adding this to the ReplaceAttr function removes the allocation.

Global map:

var levelNames = map[slog.Level]slog.Value{
	slog.LevelDebug: slog.StringValue("DEBUG"),
	slog.LevelInfo:  slog.StringValue("INFO"), // default
	slog.LevelError: slog.StringValue("ERROR"),
	slog.LevelWarn:  slog.StringValue("WARN"),
}

Inside the function:

	if a.Value.Kind() == slog.KindAny {
		level, ok := a.Value.Any().(slog.Level)
		if ok {
			a.Value = levelNames[level]
		}
	}

@vikstrous
Copy link
Author

I think this is more of an obscure gotcha with a subtle workaround rather than something that needs to be fixed. Maybe it's worth a small mention in the documentation that any use of slog.Any causes an allocation because of the need to marshal the value and when ReplaceAttr is used, the level is wrapped in slog.Any, so it's a good idea to return pre-allocated strings for the level if writing your own ReplaceAttr function? Maybe this is too obscure and low impact to be worth a mention? At least having this issue on github now might save someone time if they go down the same rabbit hole.

vikstrous added a commit to vikstrous/slogexamples that referenced this issue Aug 8, 2023
vikstrous added a commit to vikstrous/slogexamples that referenced this issue Aug 8, 2023
@jba
Copy link
Contributor

jba commented Sep 1, 2023

@vikstrous thanks for traveling down this rabbit hole! I think adding anything to the doc about this will just make people think too much about allocation. It almost never matters. But as you say, we have this issue to point people to if they notice.

One problem with your solution: a slog.Level can be any integer, so you have to handle the case where the key is not in your map.

@seankhliao
Copy link
Member

perhaps encoding.AppendText proposed in #62384 is relevant

@artyom
Copy link
Member

artyom commented Oct 6, 2023

@vikstrous @jba I found that this snippet in ReplaceAttr avoids allocations just as well as the map version, but is more contained and tries to avoid false positives on arbitrary integers:

func(groups []string, a slog.Attr) slog.Attr {
	if groups == nil && a.Key == slog.LevelKey && a.Value.Kind() == slog.KindAny {
		if level, ok := a.Value.Any().(slog.Level); ok {
			a.Value = slog.StringValue(level.String())
		}
	}
	return a
}

If I apply the following change directly to log/slog source:

diff --git src/log/slog/handler.go src/log/slog/handler.go
index 03d631c0ac..72a39ec9a6 100644
--- src/log/slog/handler.go
+++ src/log/slog/handler.go
@@ -290,7 +290,7 @@ func (h *commonHandler) handle(r Record) error {
 		state.appendKey(key)
 		state.appendString(val.String())
 	} else {
-		state.appendAttr(Any(key, val))
+		state.appendAttr(String(key, val.String()))
 	}
 	// source
 	if h.opts.AddSource {

some of the package benchmarks report reduced allocations:

goos: darwin
goarch: arm64
pkg: log/slog
                              │ /tmp/before.txt │           /tmp/after.txt            │
                              │     sec/op      │   sec/op     vs base                │
JSONHandler/defaults-8              947.9n ± 0%   952.5n ± 0%   +0.49% (p=0.000 n=10)
JSONHandler/time_format-8           1.547µ ± 0%   1.241µ ± 0%  -19.78% (p=0.000 n=10)
JSONHandler/time_unix-8             1.374µ ± 0%   1.075µ ± 0%  -21.76% (p=0.000 n=10)
JSONEncoding/json.Marshal-8         112.2n ± 0%   111.8n ± 0%   -0.40% (p=0.000 n=10)
JSONEncoding/Encoder.Encode-8       105.9n ± 0%   105.6n ± 0%   -0.28% (p=0.026 n=10)
geomean                             474.0n        431.7n        -8.93%

                              │ /tmp/before.txt │             /tmp/after.txt              │
                              │      B/op       │    B/op     vs base                     │
JSONHandler/defaults-8             0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹
JSONHandler/time_format-8         152.00 ± 0%     32.00 ± 0%   -78.95% (p=0.000 n=10)
JSONHandler/time_unix-8            120.0 ± 0%       0.0 ± 0%  -100.00% (p=0.000 n=10)
JSONEncoding/json.Marshal-8        16.00 ± 0%     16.00 ± 0%         ~ (p=1.000 n=10) ¹
JSONEncoding/Encoder.Encode-8      8.000 ± 0%     8.000 ± 0%         ~ (p=1.000 n=10) ¹
geomean                                       ²               ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

                              │ /tmp/before.txt │             /tmp/after.txt              │
                              │    allocs/op    │ allocs/op   vs base                     │
JSONHandler/defaults-8             0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹
JSONHandler/time_format-8          4.000 ± 0%     1.000 ± 0%   -75.00% (p=0.000 n=10)
JSONHandler/time_unix-8            3.000 ± 0%     0.000 ± 0%  -100.00% (p=0.000 n=10)
JSONEncoding/json.Marshal-8        2.000 ± 0%     2.000 ± 0%         ~ (p=1.000 n=10) ¹
JSONEncoding/Encoder.Encode-8      1.000 ± 0%     1.000 ± 0%         ~ (p=1.000 n=10) ¹
geomean                                       ²               ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

All tests pass, with the exception of ExampleHandlerOptions_customLevels, which unfortunately relies on the current implementation detail:

// Handle custom level values.
level := a.Value.Any().(slog.Level)

@vikstrous
Copy link
Author

vikstrous commented Oct 9, 2023

Thanks, @artyom! I pasted your example ReplaceAttr function into my test and it passes.

I don't think the change to the standard library should be made because it would make it harder to work with levels in ReplaceAttr implementations, but the snippet you shared can be used by anyone that cares about getting to 0 allocations with ReplaceAttr. I think there's nothing more to be said on this topic, so I'll close the issue :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

5 participants