Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: coalesce consecutive calls to print/println #21417

Open
josharian opened this issue Aug 12, 2017 · 7 comments
Open

cmd/compile: coalesce consecutive calls to print/println #21417

josharian opened this issue Aug 12, 2017 · 7 comments
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Performance
Milestone

Comments

@josharian
Copy link
Contributor

CLs 55095–55098 change how the compiler (and runtime) implement the built-ins print and println, in order to reduce the amount of code they generate. (The goal is to both reduce the runtime's binary footprint and to increase instruction density in important runtime routines.)

This issue spells out one further obvious optimization. Consider:

  println("a")
  println("b")

After CL 55098, this is gets compiled into: printlock, print "a\n", printunlock, printlock, print "b\n", printunlock. But it could get compiled into: printlock, print "a\nb\n", printunlock. In short, coalesce consecutive calls to print/println in a single printlock-protected sequence of calls, combining string constants as possible, taking care to correctly handle the spaces and newlines introduced by println.

Low priority, but might be an interesting learning exercise for someone interested in the mid-tier of the compiler. It's not super straightforward, since walkprint looks at one node at a time, but it also shouldn't too difficult.

@josharian josharian added the Suggested Issues that may be good for new contributors looking for work to do. label Aug 12, 2017
@josharian josharian added this to the Unplanned milestone Aug 12, 2017
@josharian
Copy link
Contributor Author

Although special care is required if evaluating the arguments to the second println could panic. Maybe not a good starter issue after all.

@josharian josharian removed the Suggested Issues that may be good for new contributors looking for work to do. label Aug 12, 2017
@robpike
Copy link
Contributor

robpike commented Aug 12, 2017

I came here to say, "don't do that!" because of argument evaluation.

I use multiple printlns like this to diagnose certain kinds of failure. coalescing them into a single print would completely eliminate that possibility.

In any case, we shouldn't care of println is especially efficient. You shouldn't be using it in production code.

@josharian
Copy link
Contributor Author

I came here to say, "don't do that!" because of argument evaluation.

Of course; we can only do this if it is not detectable from user code. (This would be easier in SSA form, but it is still possible, using package gc's safeexpr.)

In any case, we shouldn't care of println is especially efficient. You shouldn't be using it in production code.

My goal here is shrinking runtime routines that use println, for smaller binaries and better instruction density. I would happily make println slower if it made the call sites smaller or use less stack.

@rasky
Copy link
Member

rasky commented Aug 14, 2017

Tangential, but what about emitting blocks that contain print at the end of the function, as unlikely blocks? This would increase density of likely paths that never call print.

@dr2chase
Copy link
Contributor

This doesn't completely eliminate the ability to do that sort of debugging, though it does make it a lot more annoying:

//go:noinline
dbgPrintln(s string) {
   println(s)
}

But how much do we gain from doing this?

@thanm
Copy link
Contributor

thanm commented Aug 14, 2017

Both clang and GCC do this optimization for C++ programs, for what that's worth. E.g.

  printf("a");
  printf("b");
  printf("c\n");

gets collapsed into a single 'puts("abc")'.

Q: why do this only with println? Why not apply the same coalescing to fmt.Printf calls?

@josharian
Copy link
Contributor Author

Tangential, but what about emitting blocks that contain print at the end of the function, as unlikely blocks? This would increase density of likely paths that never call print.

Good idea.

This doesn't completely eliminate the ability to do that sort of debugging

If done soundly (which was my intent), it doesn't impact it at all. It just limits the scope of the optimization.

But how much do we gain from doing this?

My simple optimizations cut 0.5% from hello world. Given that all of that is from the runtime, it seems low priority but still worthwhile (as I said initially).

Q: why do this only with println? Why not apply the same coalescing to fmt.Printf calls?

A few reasons. fmt is a bit high level for the compiler to be messing with. fmt's printing routines return values. In general, fmt's printing routines take care to make exactly one underlying write call.

@ALTree ALTree added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Performance labels Sep 22, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Performance
Projects
None yet
Development

No branches or pull requests

6 participants