Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: remove fmt dependency? #15583

Closed
cathalgarvey opened this issue May 6, 2016 · 25 comments
Closed

encoding/json: remove fmt dependency? #15583

cathalgarvey opened this issue May 6, 2016 · 25 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@cathalgarvey
Copy link

Would there be interest in a PR that removes fmt as a dependency in encoding/json, if doing so didn't overly complicate the code?

I ask because fmt is a large dependency to pull in for what appears to be exclusively error formatting in decode.go.

Happy to contribute this.

@bradfitz bradfitz changed the title Remove fmt from encoding/json? encoding/json: remove fmt dependency? May 6, 2016
@bradfitz bradfitz added this to the Go1.8Maybe milestone May 6, 2016
@bradfitz
Copy link
Contributor

bradfitz commented May 6, 2016

Maybe? Got any before/after numbers?

/cc @crawshaw

@cathalgarvey
Copy link
Author

None yet; you mean for binary size or performance-in-error?

Contributing to Go is something of a time investment so I didn't bother doing any work before checking if there was interest.

@bradfitz
Copy link
Contributor

bradfitz commented May 6, 2016

For whatever metric you cared enough about to file this bug.

@cathalgarvey
Copy link
Author

Ha, OK: background then. I'm using GopherJS to transpile some projects to JS for a dashboard. While this is the sort of case where large JS files are OK, because it's a long-running browser App and load-times are OK, I noticed that the binaries were huge.

Two obvious culprits were net/http, which was pulling in transpilations of the entire TLS stack, and fmt, which has a reputation for being large.

So, while Go binaries being large is generally simply taken as a fact of life in go build land, over in gopherjs build land it's another story, because 3.5M JS libraries are essentially useless unless the comprise the whole core product.

..hence my recent crusade to try and reduce the use of fmt.Errorf when errors would suffice. I don't yet know that to be the case in encoding/json, because at least here the f part of Errorf appears to be leveraged, as is the de-interfacing powers of fmt.

@bradfitz
Copy link
Contributor

bradfitz commented May 6, 2016

encoding/json already depends on strconv which has https://golang.org/pkg/strconv/#Quote and is about the only interesting use of fmt in the json package.

So try it and see how much smaller the Javascript gets.

@minux
Copy link
Member

minux commented May 6, 2016 via email

@cathalgarvey
Copy link
Author

Is there not a mild intrinsic value in partitioning, anyway? And value also in using the appropriate tool, perhaps in this case errors for minor errors?

As I said, formatting of interface{}s may be important here when I dig in. In which case, fmt may be the best fit for purpose. If not, why keep it?

On 7 May 2016 00:23:24 IST, Minux Ma notifications@github.com wrote:

How many applications will benefit from this?

I imagine it'd be find an application that only uses encoding/json but
not
fmt or any other packages that use fmt.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub:
#15583 (comment)

Sent from my Android device with K-9 Mail. Please excuse my brevity.

@minux
Copy link
Member

minux commented May 7, 2016 via email

@josharian
Copy link
Contributor

The other benefit to fewer dependencies is that the dependency graph is sparser, so more things can be compiled concurrently. This applies whether or not the code uses fmt somewhere else. The go command is not as good at taking advantage of this now as it could be, but I hope that that will improve next cycle.

I think the patch could be made nicer by introducing local functions errorf and errorf2 (naming improvements welcome), at which point the patch is basically s/fmt.Errorf/errorf/ or s/fmt.Errorf/errorf2 as appropriate.

@cathalgarvey
Copy link
Author

@josharian: Interesting, I hadn't even considered that; still a lot of compiler-fu to learn. :)

@minux: I don't think that harms readability much, personally? Also, 9% is pretty significant in my book, and may work out even better for downstream hax like GopherJS. For clarity, a variadic local errorf as @josharian suggests would clean things up significantly.

@cathalgarvey
Copy link
Author

cathalgarvey commented May 8, 2016

@minux: I made some changes to some GopherJS code to avoid encoding/json, using my fmt-free fork of github.com/kurrik/json instead. It shaved about .5Mb from the resulting unminified JS (1.5Mb->961k), which was a bit over 33%. So, while the premium of 9% in pure-Go might seem mild, for GopherJS it's a huge difference.

Addendum; these successful reductions also came from removing fmt from another upstream tool; took me a while to hunt it down in that dependency and I ended up pulling it into my tree. After removing net/http from my GopherJS code, which removed ~1.5Mb, the biggest reduction has been the 0.5Mb reduction from removing fmt. It matters, this can be the difference between using Go and simply using straight JS!)

@minux
Copy link
Member

minux commented May 8, 2016 via email

@cathalgarvey
Copy link
Author

Plenty of my packages use JSON but not fmt? Not all applications push out human-readable text output yunno. ;)

On 8 May 2016 20:55:52 IST, Minux Ma notifications@github.com wrote:

The huge reduction probably suggests there is something
very inefficient about the fmt package. I'd suggest figure
out why GopherJS compiler generates so much Javascript
for the fmt package first.

The 9% reduction only applies to toy examples. Again, I
can't think of any real world Go application that only uses
encoding/json, but not (directly or indirectly) fmt.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub:
#15583 (comment)

Sent from my Android device with K-9 Mail. Please excuse my brevity.

@minux
Copy link
Member

minux commented May 8, 2016

fmt is not only about user output.

Of the 155 std packages, 83 of them uses fmt.
Pretty much all io related packages (net, net/http,
os/user, log) use fmt, so I will be very surprised
to learn some useful Go application that doesn't
use fmt at all.

Basically, the only way such a program can interact
with the outside world without fmt package is via
the syscall and os packages. And such a program
can't even use the flag package....

@cathalgarvey
Copy link
Author

It's no surprise that text-heavy tools use fmt, but a lot of packages use it only for Errorf. If the argument is that fmt is unavoidable, and we should learn to love to bomb.. I don't buy it.

If they're all like encoding/json and a little effort could strip it, then why not incrementally decouple the stdlib and remove it?

I've just stripped encoding/xml, and it was harder. Formatting was used more often for quoting and interface conversions. But even then, the types and formats were all the same; string, int, and interface.Type. The full power of fmt was not warranted, merely a helper that would convert and concatenate variadic simply types.

On 8 May 2016 23:33:31 IST, Minux Ma notifications@github.com wrote:

fmt is not only about user output.

Of the 155 std packages, 83 of them uses fmt.
Pretty much all io related packages (net, net/http,
os/user, log) use fmt, so I will be very surprised
to learn some useful Go application that doesn't
use fmt at all.

Basically, the only way such a program can interact
with the outside world without fmt package is via
the syscall and os packages.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub:
#15583 (comment)

Sent from my Android device with K-9 Mail. Please excuse my brevity.

@spenczar
Copy link
Contributor

spenczar commented May 9, 2016

@minux - I think you're right that it would be rare for fmt to be missing from a meaningful program that uses json... but that's not a reason to keep the dependency. Is there any reason that this mild bit of separation is harmful? While you're correct that the benefits are small, it should be done anyway if there is no downside and the change is simple.

@bradfitz
Copy link
Contributor

bradfitz commented May 9, 2016

I agree with @minux that almost all real programs will bring in fmt anyway and you should first understand why fmt compiles to such large GopherJS output code first.

The other possibility is modify the GopherJS compiler to replace fmt.Errorf with a lighter work-alike. But that's kinda fixing the wrong problem.

I don't want to start making every package depending on fmt have its own fmt-like "minimal" versions. That would mean that in a real program we'd have both the actual fmt implementation as well as all the "minimal" fmt clones, making the size problem even worse.

I'd rather just see the linkers (Go and GopherJS's) fixed to do more dead code elimination, if that's the issue. (sometimes unnecessary generated init funcs pin things that might not otherwise be needed).

But if it turns out that a perfect linker couldn't dead-code eliminate something (perhaps because the code paths to take depend on the Errorf format string), perhaps only then could we change something to give the linker more information. But only if it helps significantly.

@cathalgarvey
Copy link
Author

Personally I feel that, architecturally, fmt would probably have more logically been broken into the scan and formatting functions. When I import fmt, am I not also importing the entire Scan tooling? At least in GopherJS this appears to be the case?

As it happens, making "lite" versions for my GopherJS projects is exactly what I'm doing right now. I'm making a fmt clone for just the most common 99% of fmt usecases in the wild: fmt.Errorf for simple errors, and fmt.Println/fmt.Sprintf for simple datatypes and format strings. Most of the extra 'goodies' in fmt are unnecessary for these cases.

Perhaps the most valuable change would be to make errors just a little bit richer, to entice people away from fmt.Errorf for things that don't really call for it. If errors.New had been a variadic that accepts and default-formats inputs (a la "%v"), joining without spaces, then perhaps people wouldn't be using fmt.Errorf so much. :)

@bradfitz
Copy link
Contributor

bradfitz commented May 9, 2016

When I import fmt, am I not also importing the entire Scan tooling? At least in GopherJS this appears to be the case?

A good linker should discard things which are unused. I don't know whether GopherJS can do that.

@randall77
Copy link
Contributor

randall77 commented May 9, 2016

None of the Scanf code is included in a binary if you use only the print functions in fmt.
(This was fixed in https://go-review.googlesource.com/17398/)
As Brad says, this may be an issue in the GopherJS linker, whatever "linker" means in that context.

@dmitshur
Copy link
Contributor

dmitshur commented May 9, 2016

I don't want to start making every package depending on fmt have its own fmt-like "minimal" versions. That would mean that in a real program we'd have both the actual fmt implementation as well as all the "minimal" fmt clones, making the size problem even worse.

I agree with @bradfitz here. I think that trying to remove a single dependency from a package in order to optimize performance is setting a bad precedent. The Go programming language and its compilers are meant to make our jobs, the humans, easier, and I want to continue to be be able to write natural and idiomatic Go code.

I'd rather just see the linkers (Go and GopherJS's) fixed to do more dead code elimination, if that's the issue. (sometimes unnecessary generated init funcs pin things that might not otherwise be needed).

This is where I'd like to see time and effort being spent, rather than trying to eliminate usages of packages in a few places, because that is likely to regress in the future, or cause pain for people trying to write idiomatic Go code because suddenly there are certain packages they shouldn't import (according to fuzzy metrics).

A good linker should discard things which are unused. I don't know whether GopherJS can do that.

But if it turns out that a perfect linker couldn't dead-code eliminate something (perhaps because the code paths to take depend on the Errorf format string), perhaps only then could we change something to give the linker more information. But only if it helps significantly.

GopherJS is able to and currently performs high quality DCE, potentially better than generic JS minifiers can, because it has full access to the original source that's being built and it sees what is statically needed/used. It follows the constraint that it must always produce valid code that works as specified in Go spec; it can't take shortcuts that would be against Go spec.

However, its DCE efforts are significantly thwarted because of run-time reflection via reflect package. That ability means it's not possible to determine at compile type if some types, etc., are not used and eliminate them, because they may be accessed via reflection (which in 99% of cases they are not, but to generate correct programs, that 1% possibility cannot be discarded). /cc @neelance

There is an open issue gopherjs/gopherjs#186 to implement optional "unsafe aggressive DCE" where if you promise not to use reflect to access things that aren't statically mentioned, GopherJS could eliminate them (and if you break your promise, behavior is undefined). However, it's not implemented yet.

@minux
Copy link
Member

minux commented May 9, 2016 via email

@dmitshur
Copy link
Contributor

dmitshur commented May 9, 2016

even with reflect, you still can't use a type that has not been created statically somewhere in the program.

I think that's true for types, but I don't think it is for methods. Sorry, I should've been more precise in my post above (instead of saying "types, etc.").

See https://godoc.org/reflect#Value.MethodByName. An arbitrary program can get user input from keyboard or over network as a string and use MethodByName and Call a method that is not statically mentioned anywhere.

Not being able to eliminate methods causes a chain reaction, since those methods will mention other funcs, types and their methods, and usually pull in large parts of Go packages that are not statically used.

See also recent @crawshaw's changes on linker DCE and reflect.

Those are helpful, thanks for pointing them out! For reference, what I found is what's mentioned at bottom of #6853 and https://github.com/golang/go/commits/master?author=crawshaw.

@minux
Copy link
Member

minux commented May 10, 2016 via email

@quentinmit quentinmit added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 10, 2016
@rsc
Copy link
Contributor

rsc commented Oct 20, 2016

No thanks. I think the expectation is that basically every Go program in the world is going to import fmt one way or the other.

@golang golang locked and limited conversation to collaborators Oct 20, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

10 participants