Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/vet: false positive printf-like function detection for github.com/leonelquinteros/gotext #57288

Closed
divVerent opened this issue Dec 13, 2022 · 19 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@divVerent
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.19.2 linux/amd64

Does this issue reproduce with the latest release?

Yes: https://github.com/divVerent/aaaaxy/actions/runs/3685154878

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/rpolzer/.cache/go-build"
GOENV="/home/rpolzer/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/rpolzer/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/rpolzer/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go-1.19"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.19/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.19.2"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3901397724=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Used "go vet" on a program that calls Get("100%") on a Po object. The method is defined here:

https://github.com/leonelquinteros/gotext/blob/6998e37f95499dad532f617cb0ff07fdbabc6ef7/po.go#L83

What did you expect to see?

This is valid, as the call goes through

https://github.com/leonelquinteros/gotext/blob/6998e37f95499dad532f617cb0ff07fdbabc6ef7/domain.go#L252

https://github.com/leonelquinteros/gotext/blob/6998e37f95499dad532f617cb0ff07fdbabc6ef7/helper.go#L37

I.e. this is only a printf-ish function if receiving more than one arg - if exactly one arg is provided, it is NOT interpreted as a format string.

What did you see instead?

# github.com/divVerent/aaaaxy/internal/font
Error: internal/font/charset.go:23:9: (*github.com/leonelquinteros/gotext.Po).Get format %& has unknown verb &
# github.com/divVerent/aaaaxy/internal/playerstate
Error: internal/playerstate/playerstate.go:237:10: (*github.com/leonelquinteros/gotext.Po).Get format % is missing verb at end of string
Error: internal/playerstate/playerstate.go:262:10: (*github.com/leonelquinteros/gotext.Po).Get format % is missing verb at end of string
Error: internal/playerstate/playerstate.go:266:10: (*github.com/leonelquinteros/gotext.Po).Get format % is missing verb at end of string

Looks like as a workaround I have to call

po.Get("100%s", "%")

to get both the correct output AND no go vet errors. This is kinda sad. Not so great for the translator, who will needlessly see a format string.

On #17058 it has been agreed that we want NO annotation mechanism to silence vet warnings - so I guess this means we need to fix this false positive instead.

@divVerent
Copy link
Author

divVerent commented Dec 13, 2022

(BTW, as for why I use gotext and not x/text/messages - I really insist on a file format with lots of good editing tools such as gettext's AND need to load them from an in-game VFS [possibly the user's file system just like other game data]; I couldn't figure out how to do that with x/text/messages).

@thanm thanm added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 13, 2022
@thanm
Copy link
Contributor

thanm commented Dec 13, 2022

@thanm
Copy link
Contributor

thanm commented Dec 13, 2022

How exactly would you propose to fix this?

@divVerent
Copy link
Author

divVerent commented Dec 13, 2022 via email

@timothy-king
Copy link
Contributor

#57288 (comment) I think comes down to whether gotext.Printf is a "printf-like" function:

func gotext.Printf(str string, vars ...interface{}) string {
	if len(vars) > 0 {
		return fmt.Sprintf(str, vars...)
	}
	return str
}

I am curious what the author of this code (@leonelquinteros) thinks the decision should be. Is "gotext.Printf" a wrapper of "fmt.Sprintf" and it is desired for go vet users to be warned on format strings here? Or no because a different format (like having a trailing"%") should be supported? Or only when len(vars) > 0**?

This does not necessarily dictate what vet should do, but I am curious.

** I doubt a len(vars) > 0 requirement would make sense for vet. I am just interested if that would change opinions.

So maybe not detect the function as a printf function if the call to a
printf function is conditional?

Right now the printf.Analyzer checks whether there is some forwarded call. There are a few different ways we could choose to restrict based on a condition.

  1. We could still consider a function a wrapper if all paths reach a printf wrapper. Pro: precise. Con: Challenging to retrofit & requires control flow.
  2. We could choose to only consider it a wrapper if a call to a printf-like is in the func's block and no return statement or panic precedes it (i.e. some printf-like dominates the function's entry). Pro: Seems like it would capture most cases in the wild? Somewhat simple to implement. Con: Won't capture as many cases as the previous suggestion, if p { return fmt.Sprintf(fmt, args...) }; ...; fmt.Printf(fmt, args...); return "lorem ipsum";. Not clear if this distinction arises in practice.
  3. We could stop if there is any preceding branching. Pro: Easy to retrofit. Con: Somewhat restrictive so hard to predict the impact, e.g. if p { irrelevant(); }; fmt.Printf(fmt, args...) would not longer be a wrapper.

Given the relative rarity of the issue, I think we should test candidate solutions before making a decision.

po.Get("100%s", "%")

@divVerent An alternative stopgap is to have a function that printf.Analyzer cannot infer is a wrapper. Such as:

 func foo(fmt first, args ...interface{}) {
   id := func(s string) string { return s } // https://github.com/golang/go/issues/57288
   po.Get(id(first), ...args)
 }

Not a great solution, but it may be preferable to po.Get("100%s", "%"). You would need one of these per printf-like function from another package.

divVerent added a commit to divVerent/aaaaxy that referenced this issue Dec 14, 2022
Instead of using an artificial format string, I am now accessing the Get
method on the Po object via the Translator interface. Accessing via an
interface makes it impossible for go vet to detect if it's a printf-like
function, thereby evading the false positives due to Get not acting like
printf if it only gets one arg.
@divVerent
Copy link
Author

Obscuring the format string with an id() wrapper or similar won't work as then xgotext will no longer detect these strings as translatable and put them in the .pot file.

However I found a workaround - it seems like xgotext just looks at the package name of the type of the receiver, not the type itself. And the package is so nice to define a Translator interface that *Po implements - so I can go via that interface whenever I need to evade go vet's printf detection.

This is still dirty and against the philosophy in #17058 to NOT require explicit annotations of any form to evade false positives - instead go vet should not HAVE false positives.

@divVerent
Copy link
Author

divVerent commented Dec 14, 2022

What speaks AGAINST skipping printf detection if the function is conditional is, of course, constructs like log.V(1).Infof("hello, world"). And to be honest I don't really see a good way how go vet can distinguish

func (v V) Infof(format string, args ...any) {
  if v <= *verbosity {
    fmt.Printf(format, args...)
  }
}

(this SHOULD count as printf-like)

from

func (p *Po) Get(format string, args ...any) string {
  if len(args) != 0 {
    return fmt.Sprintf(format, args...)
  } else {
    return format
  }
}

(this is what this package is doing)

The only real difference is that here in theory an analyzer could go through the entire thing and notice that the comparison is only based on the arguments and literals, while in the log.V case, a flag or otherwise external value is involved and the analyzer can prove nothing.

@timothy-king
Copy link
Contributor

The verbosity example you gave would be blocked by all of my suggestions. So none of those seem like the right call.

The only real difference is that here in theory an analyzer could go through the entire thing and notice that the comparison is only based on the arguments and literals, while in the log.V case, a flag or otherwise external value is involved and the analyzer can prove nothing.

It might just be reading args or format in a context other than a matching call? Maybe it is just using one before a matching call?

Another distinguishing characteristic is fmt.Sprintf vs fmt.Printf.

Worth also pointing out that someof printf's unit tests would likely need to be modified to support the types of restrictions needed: https://cs.opensource.google/go/x/tools/+/refs/tags/v0.4.0:go/analysis/passes/printf/testdata/src/b/b.go

@divVerent
Copy link
Author

Another thing one could do is to take the name of the argument into account - and e.g. not vet if it is named maybeFormat.

OTOH, I would actually want most calls to Get be vetted here.

If really no solution is feasible, I will maybe suggest to the author of the package to add four more methods (GetL, GetLN, GetLC, GetLNC) where L means literal (takes no further arguments), to be used when not formatting. Sadly they can't ditch the "1 arg means no formatting" rule anymore as they too have dependencies (but could introduce the literal versions now and ditch the 0/n distinction in next major version). For myself I have a working workaround now, although only thanks to lucky circumstances (that interface the package defines primarily so it can use both PO and MO files).

@leonelquinteros
Copy link

Hello,
I'm catching up with the thread after being summoned by Timothy.
In my opinion (which I just formed quickly after reading the thread), I think gotext.Printf shouldn't be considered a fmt.Sprintf wrapper, my reasons for that are:

  • The most common usage of gotext.Printf is without parameters and so not going through fmt.Sprintf
  • It's common to include special chars in translation strings that may or may not be using fmt.Sprintf
  • Another use case got gotext is to translate strings that will be used by some printf-like function after translation, so it's expected these contain control chars inside.
  • go vet is a tool that should help developers, and on this case the balance between true and false positives will be very even if not favoring the false ones, so I'd say better to skip this one.

For the specific needs of validating format strings and function calls inside gotext, I'd say it's better to create a tool inside this package to perform this tasks, I think this matches the spirit of the package as well.

That said, I'm not sure how vet is detecting the wrapping of fmt.Sprintf, so it may not be straight forward to exclude gotext.Printf from the detection heuristics.

@divVerent
Copy link
Author

That sounds good - xgotext can be the tool to do format string validation, after all.

Based on the above, vet apparently transitively follows the call chain to a printf function. There may be tricks to make it not notice a format string, like fmt.Sprintf(""+format, args...), although these tricks may stop working at some point when vet gets more clever.

@timothy-king
Copy link
Contributor

Related #26555.

@timothy-king
Copy link
Contributor

An alternative to fmt.Sprintf(""+format, args...) is _ = &format; fmt.Sprintf(format, args) as an "annotation" that a function is not "printf-like". There is an explicit check for it, and I don't think vet will get too clever for taking addresses any time soon.

@adonovan
Copy link
Member

adonovan commented Dec 14, 2022

I think gotext.Printf shouldn't be considered a fmt.Sprintf wrapper

It is a printf wrapper, since it may forward its arguments to printf. It does so only conditionally, but that's true of a lot of printf wrappers, such as loggers with a verbosity control. The only problem I see is that gotext.Printf interprets its arguments inconsistently: sometimes as a format string and sometimes literally. This seems just as likely to confuse human readers as mechanical ones: if a Go programmer calls a function named Printf and wants to see the output "100%", they will follow Go conventions and pass the argument "100%%", and the function will not do what they expect.

I suggest you change gotext.Printf to delegate to fmt.Sprintf unconditionally. Or remove the "f" suffix and change its behavior to delegate to fmt.Print. Or if neither of those is appealing, change the function body to hide from vet the fact that it delegates.

@divVerent
Copy link
Author

I disagree with this solution - yes, the function called Printf really should not be called that.

But the functions I was about are NOT called Printf. They are called Get, GetC, GetN, GetNC.

And apparently the author intended them to not take format strings if there is just one arg.

In a /v3 of the package that may be something to reconsider - how do we handle this in /v2 though?

Suggestion: can we maybe have a global variable (with a function to enable this mode) in the package that enables always-formatstring handling so I can migrate my project to that?

@adonovan
Copy link
Member

yes, the function called Printf really should not be called that.
But the functions I was about are NOT called Printf. They are called Get, GetC, GetN, GetNC.

It's not just the name, it's the behavior of the function that's the problem. 'Get' interprets its 'format' argument inconsistently: sometimes as a format string, and sometimes not. The convention throughout the Go libraries is that functions choose one or the other (and ideally reflect this choice in their name).

And apparently the author intended them to not take format strings if there is just one arg.

I'm asserting that this was a design mistake, since it confuses not only vet, but human readers too. (Perhaps vet should attempt to report "function interprets format string argument inconsistently"?)

In a /v3 of the package that may be something to reconsider - how do we handle this in /v2 though?

If you want to disable vet checking for this function, then change its body as described above to obscure the delegation to printf. Or you could split Get into two functions, one Printf-like and one Print-like. If you can't change Get itself, you can write two wrappers, and use them instead. You can add if false { _ = fmt.Sprintf(format, args...) } to a function to force vet to interpret it as a printf wrapper, or use ""+format, as you pointed out, to do the opposite.)

Suggestion: can we maybe have a global variable (with a function to enable this mode) in the package that enables always-formatstring handling so I can migrate my project to that?

I'm not sure what you mean. If you can change the function, you can trick vet into treating it as as printf wrapper or not. If you can't change the function, there is no way to change vet's interpretation.

no way to get both the correct output AND no go vet errors. This is kinda sad.

To be clear, I agree, it is sad that an API design mistake in one of your dependencies causes vet to report a false positive. In effect, vet imposes a stronger type on Get than Go itself, and the type of Get cannot be expressed in this stronger type system, and, like any type system, it rejects some programs that would run just fine. But since there is a design mistake in the code and there are simple workarounds, I don't think it's worth trying to change vet to accommodate this case.

@divVerent
Copy link
Author

So the idea is that leonelquinteros puts in the ""+format hack into the library's /v2 as it is right now, and /v3 will use separate functions for literals and format string.

@leonelquinteros is that OK with you?

@leonelquinteros
Copy link

@divVerent I'm not sure about maintaining multiple versions of the package just for this change.

About the design decision, it's not a mistake, it was made on purpose and it's documented, in fact, it's the only documentation for that function: https://pkg.go.dev/github.com/leonelquinteros/gotext#Printf , so readers should be safe.
I agree it's ugly when seen from this perspective, but it was due to organic grow, as it deviated from gettext standard, but found useful for some package users.

As I said before, if you need to make vet not warn about this, even when the documentation explicitly say that Vet uses heuristics that do not guarantee all reports are genuine problems (https://pkg.go.dev/cmd/vet) in the very first paragraph, there are several workarounds for your case and some have been already discussed in this thread.

@leonelquinteros
Copy link

I can foresee a /v2 version that has Get and GetWithParams (or something) for the 2 separate use cases, but starting a /v2 version would include a lot of other decision changes that have been discussed in the project before and that will require more effort, so ETA for that would be really long given the maintenance rate i put into the project this days.

PRs are always welcome though :) we can start in a branch for your specific use case and move from there

@golang golang locked and limited conversation to collaborators Dec 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants