syscall: starting a process with large number of arguments is way too slow #41825
Comments
Change https://golang.org/cl/259978 mentions this issue: |
I tried running your test before and after @ianlancetaylor's change is applied: https://go-review.googlesource.com/c/go/+/259978
Not much difference in the time it takes to execute your program, until you have 10,000 arguments. I have yet to see a program that takes 10,000 arguments. Alex |
Thanks for the note.
Granted, thousands of arguments is not common, but I still don't want to see Go doing it (much) more slowly than Python :)
Bob
On Wed, Oct 7, 2020 at 1:02 AM Alex Brainman wrote:
I tried running your test before and after @ianlancetaylor's change is applied: https://go-review.googlesource.com/c/go/+/259978
100 repetitions
0 arguments: 1,590ms
10 arguments: 1,637ms
50 arguments: 1,592ms
200 arguments: 1,611ms
1,000 arguments: 1,658ms
10,000 arguments: 7,828ms
Ratio of 10,000 : 10 argument durations: 4.78
100 repetitions
0 arguments: 1,563ms
10 arguments: 1,456ms
50 arguments: 1,526ms
200 arguments: 1,575ms
1,000 arguments: 1,565ms
10,000 arguments: 6,768ms
Ratio of 10,000 : 10 argument durations: 4.65
Not much difference in the time it takes to execute your program, until you have 10,000 arguments. I have yet to see a program that takes 10,000 arguments.
Alex
|
@bobjalex does the merged fix improve your example as expected? Alex found it had virtually no effect. |
Yes, it had a huge and easily noticeable effect on my experiments with 10,000+ arguments. For more typical numbers of arguments, there might be a minuscule improvement, but nothing human-noticeable.
|
Unless... you run a command with a few arguments 10,000 times.
|
Sorry about all the little messages, but let me explain better...
I was running an experiment, written in Go, running a command repeatedly with an increasingly large number of arguments to see when it would break due to too many arguments. The experiment program printed a "progress dot" on the console between invocations. The progress dots appeared much more slowly than I expected.
So I wrote the same experiment in Python, and the dots were very fast. That didn't seem right, so I located the bottleneck and reported it.
|
Change https://golang.org/cl/260397 mentions this issue: |
To clarify, I was referring to the just-merged fix, vs the fix you suggested (which is much simpler). |
I tried, but couldn't figure out how to see the latest change -- I'm not too proficient with GitHub's web interface.
But I did try the only simplification I could think of:
func makeCmdLine(args []string) string {
	if len(args) == 0 {
		return ""
	}
	var b []byte
	for _, v := range args {
		b = append(b, ' ')
		b = append(b, []byte(EscapeArg(v))...)
	}
	return string(b[1:]) // drop the leading space
}
It seems like it might be a bit faster, eliminating a boolean test in each iteration and performing one fewer append. But my timing tests show virtually no difference for 10,000 args.
I think I like this one more, though -- it's a bit simpler.
If the latest change is different from this, could you send me a link to a page where I can see it?
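As a stand-alone illustration (quoteArg below is a simplified stand-in for syscall's EscapeArg, not the real escaping rules, and the timings are illustrative rather than the benchmark actually run here), the cost difference between the old concatenation loop and the byte-slice version can be sketched like this:

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// quoteArg is a simplified stand-in for syscall's EscapeArg: it only
// quotes args containing spaces or tabs, which is enough to exercise
// the concatenation cost being discussed.
func quoteArg(s string) string {
	if strings.ContainsAny(s, " \t") {
		return `"` + s + `"`
	}
	return s
}

// makeCmdLineOld mimics the Go 1.15.2 approach: repeated string
// concatenation, which copies the whole string built so far on
// every iteration (quadratic in total output size).
func makeCmdLineOld(args []string) string {
	var s string
	for _, v := range args {
		if s != "" {
			s += " "
		}
		s += quoteArg(v)
	}
	return s
}

// makeCmdLineNew is the simplified byte-slice version: append grows
// the buffer geometrically, so total copying is linear.
func makeCmdLineNew(args []string) string {
	if len(args) == 0 {
		return ""
	}
	var b []byte
	for _, v := range args {
		b = append(b, ' ')
		b = append(b, quoteArg(v)...)
	}
	return string(b[1:]) // drop the leading space
}

func main() {
	args := make([]string, 10000)
	for i := range args {
		args[i] = "x"
	}
	for _, impl := range []struct {
		name string
		fn   func([]string) string
	}{{"old", makeCmdLineOld}, {"new", makeCmdLineNew}} {
		start := time.Now()
		impl.fn(args)
		fmt.Printf("%s: %v\n", impl.name, time.Since(start))
	}
}
```

Both functions produce identical command lines; only the cost of building them differs.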
|
Here's a patch for the newly merged change which you can download & apply to your local Go tree (first undo the change you made): https://github.com/golang/go/commit/49225854.patch To view on GitHub: 49225854. Hopefully this gives the same performance boost as your change; if not, please tell us what you got. |
OK -- thanks. It looks to be the same as my original suggestion with b == nil replaced by len(b) == 0.
*Seems* to me that a comparison to nil would be a wee bit faster than a length comparison, and it *is* guaranteed that b will be nil on the first iteration and never again. But I doubt it will show any difference in the benchmark -- the compiler will probably optimize either one to the same instructions. I will try it and let you know if there is a difference.
I still think I like that last suggestion I sent you best :) It's simpler, equally fast, and sidesteps the comparison issue.
But no matter which of the changes survives, it will make a big difference for those 10K-argument commands!
Bob
|
Please could you test the new patch, and report the results for the benchmark you tried earlier? |
Accidentally broken by CL 259978.
For #41825
Change-Id: Id663514e6eefa325faccdb66493d0bb2b3281046
Reviewed-on: https://go-review.googlesource.com/c/go/+/260397
Trust: Ian Lance Taylor <iant@golang.org>
Trust: Alex Brainman <alex.brainman@gmail.com>
Trust: Emmanuel Odeke <emm.odeke@gmail.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
OK, I've done a bunch of timings. Results and some commentary are attached, as well as the Go program I used to perform the benchmarks.
Quick summary: all 3 new versions of makeCmdLine are pretty close to the same. I can't tell which one is faster, since I'm running on ordinary home Windows computers that have things going on in the background, and run-to-run times are variable.
If you guys have a "lab" machine that is quiet, without the background stuff, maybe you could benchmark in that environment and see more conclusive results.
Let me know if there is anything else I can do to help (or if I forgot anything :)
Bob
Comparison Timing Tests of 3 New Implementations of syscall/makeCmdLine
The test sequence was performed on 2 different home computers.
To partially mitigate the chaos of miscellaneous background activity, lots of timings were run. The whole test set was run twice: pass 1 and pass 2.
Each pass consists of tests of 3 implementations. For each implementation, a benchmark program was run twice. The benchmark program performs timings for various numbers of command-line arguments. Each timing performs 100 launches.
The 3 implementations are:
Bob's 1st way: the implementation change first suggested by Bob
New way: the implementation as entered by the Go team, a revision of Bob's 1st way
Bob's simplified way: the simplest implementation of the lot, the same idea simplified by removing the string-length test in each iteration:
func makeCmdLine(args []string) string {
	if len(args) == 0 {
		return ""
	}
	var b []byte
	for _, v := range args {
		b = append(b, ' ')
		b = append(b, []byte(EscapeArg(v))...)
	}
	return string(b[1:]) // drop the leading space
}
Bob's summary: all 3 implementations perform about the same in my timings. These tests are not sufficiently consistent to show a clear winner.
There is one test of the original Go 1.15.2 implementation in this document, and it suggests that the improvement in the new implementations becomes significant as argument counts exceed 1,000. With fewer than 1,000 arguments the difference is not significant.
Is the change worthwhile, since more than 1,000 arguments rarely occurs? Yes, if for no other reason than that the code is better: it is properly scalable. Building strings by successive concatenation is a classic performance warning. Other significant software products such as operating systems (Unix, Windows, ...) and programming-language libraries (Python, Java, ...) have apparently found it useful to use algorithms that do not have order-n-squared performance, since they do support large argument lists and handle them performantly. (Is that a word?)
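The quadratic cost of successive concatenation is easy to demonstrate in isolation. A small stand-alone sketch (the iteration counts are arbitrary, picked just to make the gap visible):

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// concat builds a string by repeated +=. Each += allocates a new
// string and copies everything so far, giving O(n^2) total work.
func concat(n int) string {
	var s string
	for i := 0; i < n; i++ {
		s += "x"
	}
	return s
}

// build uses strings.Builder, which grows its buffer geometrically,
// giving O(n) total work.
func build(n int) string {
	var b strings.Builder
	for i := 0; i < n; i++ {
		b.WriteByte('x')
	}
	return b.String()
}

func timeIt(f func(int) string, n int) time.Duration {
	start := time.Now()
	f(n)
	return time.Since(start)
}

func main() {
	const n = 50000
	fmt.Println("concat:", timeIt(concat, n))
	fmt.Println("build: ", timeIt(build, n))
}
```

On any machine the concat time grows roughly with the square of n, while the Builder time grows linearly -- the same shape as the 10,000/16,000-argument numbers above.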
Details of the experiments follow:
====================================================================
Results from Intel Core-I5 about 2 yr old
Pass 1
------
Original way (1.15.2)
0 arguments: 490ms
10 arguments: 463ms
50 arguments: 459ms
200 arguments: 481ms
1,000 arguments: 566ms
10,000 arguments: 4,995ms
16,000 arguments: 11,949ms
0 arguments: 445ms
10 arguments: 446ms
50 arguments: 449ms
200 arguments: 473ms
1,000 arguments: 610ms
10,000 arguments: 5,085ms
16,000 arguments: 12,076ms
New way
0 arguments: 447ms
10 arguments: 454ms
50 arguments: 446ms
200 arguments: 459ms
1,000 arguments: 497ms
10,000 arguments: 837ms
16,000 arguments: 1,174ms
0 arguments: 438ms
10 arguments: 446ms
50 arguments: 451ms
200 arguments: 487ms
1,000 arguments: 517ms
10,000 arguments: 891ms
16,000 arguments: 1,201ms
Bob's Simplified way
0 arguments: 450ms
10 arguments: 454ms
50 arguments: 486ms
200 arguments: 494ms
1,000 arguments: 519ms
10,000 arguments: 846ms
16,000 arguments: 1,049ms
0 arguments: 468ms
10 arguments: 467ms
50 arguments: 471ms
200 arguments: 486ms
1,000 arguments: 517ms
10,000 arguments: 855ms
16,000 arguments: 1,039ms
Bob's 1st way
0 arguments: 481ms
10 arguments: 473ms
50 arguments: 469ms
200 arguments: 480ms
1,000 arguments: 512ms
10,000 arguments: 819ms
16,000 arguments: 985ms
0 arguments: 460ms
10 arguments: 460ms
50 arguments: 464ms
200 arguments: 480ms
1,000 arguments: 502ms
10,000 arguments: 814ms
16,000 arguments: 983ms
Pass 2
------
New way again
0 arguments: 461ms
10 arguments: 456ms
50 arguments: 465ms
200 arguments: 477ms
1,000 arguments: 506ms
10,000 arguments: 817ms
16,000 arguments: 986ms
0 arguments: 461ms
10 arguments: 461ms
50 arguments: 461ms
200 arguments: 473ms
1,000 arguments: 503ms
10,000 arguments: 811ms
16,000 arguments: 1,004ms
Bob's simplified way again
0 arguments: 467ms
10 arguments: 463ms
50 arguments: 462ms
200 arguments: 477ms
1,000 arguments: 506ms
10,000 arguments: 812ms
16,000 arguments: 977ms
0 arguments: 460ms
10 arguments: 458ms
50 arguments: 462ms
200 arguments: 474ms
1,000 arguments: 503ms
10,000 arguments: 814ms
16,000 arguments: 983ms
Bob's 1st way again
0 arguments: 471ms
10 arguments: 465ms
50 arguments: 461ms
200 arguments: 474ms
1,000 arguments: 502ms
10,000 arguments: 810ms
16,000 arguments: 981ms
0 arguments: 455ms
10 arguments: 457ms
50 arguments: 459ms
200 arguments: 470ms
1,000 arguments: 500ms
10,000 arguments: 813ms
16,000 arguments: 984ms
====================================================================
Results from Intel Core-I7 about 4 yr old
Pass 1
------
New way
0 arguments: 1,229ms
10 arguments: 1,003ms
50 arguments: 980ms
200 arguments: 1,097ms
1,000 arguments: 1,031ms
10,000 arguments: 1,500ms
16,000 arguments: 1,857ms
0 arguments: 1,040ms
10 arguments: 999ms
50 arguments: 1,036ms
200 arguments: 1,016ms
1,000 arguments: 1,057ms
10,000 arguments: 1,407ms
16,000 arguments: 1,660ms
Bob's simplified way
0 arguments: 1,039ms
10 arguments: 1,018ms
50 arguments: 994ms
200 arguments: 1,032ms
1,000 arguments: 1,100ms
10,000 arguments: 1,356ms
16,000 arguments: 1,532ms
0 arguments: 1,022ms
10 arguments: 953ms
50 arguments: 1,074ms
200 arguments: 982ms
1,000 arguments: 992ms
10,000 arguments: 1,336ms
16,000 arguments: 1,555ms
Bob's 1st way
0 arguments: 1,009ms
10 arguments: 996ms
50 arguments: 1,020ms
200 arguments: 1,026ms
1,000 arguments: 981ms
10,000 arguments: 1,340ms
16,000 arguments: 1,697ms
0 arguments: 1,042ms
10 arguments: 950ms
50 arguments: 956ms
200 arguments: 1,013ms
1,000 arguments: 1,061ms
10,000 arguments: 1,378ms
16,000 arguments: 1,590ms
Pass 2
------
New way again
0 arguments: 1,203ms
10 arguments: 956ms
50 arguments: 968ms
200 arguments: 950ms
1,000 arguments: 1,017ms
10,000 arguments: 1,359ms
16,000 arguments: 1,550ms
0 arguments: 980ms
10 arguments: 961ms
50 arguments: 982ms
200 arguments: 1,143ms
1,000 arguments: 1,013ms
10,000 arguments: 1,328ms
16,000 arguments: 1,608ms
Bob's simplified way again
0 arguments: 1,060ms
10 arguments: 983ms
50 arguments: 1,034ms
200 arguments: 995ms
1,000 arguments: 1,072ms
10,000 arguments: 1,314ms
16,000 arguments: 1,514ms
0 arguments: 1,028ms
10 arguments: 981ms
50 arguments: 974ms
200 arguments: 1,072ms
1,000 arguments: 980ms
10,000 arguments: 1,344ms
16,000 arguments: 1,538ms
Bob's 1st way again
0 arguments: 968ms
10 arguments: 1,002ms
50 arguments: 1,050ms
200 arguments: 1,011ms
1,000 arguments: 1,018ms
10,000 arguments: 1,326ms
16,000 arguments: 1,534ms
0 arguments: 1,054ms
10 arguments: 995ms
50 arguments: 1,048ms
200 arguments: 1,065ms
1,000 arguments: 1,016ms
10,000 arguments: 1,313ms
16,000 arguments: 1,497ms
|
Thanks, that clarifies the issue :-) |
Well, if I read it correctly there is "at least" a 10-12x improvement already, which is not bad (10-12 sec down to 1 sec).
Now, come to think of it... there is a whole lot of memory-buffer expansion going on. Makes me wonder if doing something like this would help (pseudo-code, just an idea, might not compile):
func estimateLen(args []string) int {
	// very rough pre-estimate for a buffer length
	estimate := len(args) // spaces between args
	for _, v := range args {
		var needQuote, estimateSlash int
		for _, s := range v {
			switch s {
			case '\\':
				estimateSlash++
			case '"':
				estimateSlash++
				needQuote = 1
			case ' ', '\t':
				needQuote = 1
			}
		}
		// quoting adds 2 quote characters plus room for escapes
		estimate += len(v) + needQuote*(estimateSlash+2)
	}
	return estimate
}
func makeCmdLine(args []string) string {
	if len(args) == 0 {
		return ""
	}
	estimate := 1024
	if len(args) > 100 {
		estimate = estimateLen(args)
	}
	b := make([]byte, 0, estimate)
	for _, v := range args {
		b = append(b, ' ')
		b = append(b, []byte(EscapeArg(v))...)
	}
	return string(b[1:])
}
(but I suspect the conversion []byte(EscapeArg(v))... might make the whole thing a matter of "noise")
Sorry, I don't have a Windows machine handy to try it myself |
I did some experiments to determine how helpful preallocation is for this
case. Results attached...
-- Bob
To pre-validate your suggested approach, I did an experiment with byte-slice preallocation in a very crude way, specific to this test, that results in no subsequent allocations. The reasoning is that this simplified preallocation should show at least as much improvement as the more accurate way, since it omits scanning the arguments. Below are the algorithms I compared and the results.
The best 16,000-arg result from each test sequence shows preallocation: 967ms, no preallocation: 975ms -- a 0.82% improvement.
BTW: in case you're not a Windows person, Windows command lines are limited to 32,767 bytes, so, since args are space-separated, 16,000 is pretty close to the max.
Does the improvement warrant the change? My initial thought is not to preallocate:
- The less-than-1% improvement would likely not be noticeable for normal numbers of arguments, and probably not for huge numbers either.
- Calculating the estimate greatly increases the complexity of the code.
The code I used for the comparison:
// Simplified way with no preallocation
func makeCmdLine(args []string) string {
	if len(args) == 0 {
		return ""
	}
	var b []byte
	for _, v := range args {
		b = append(b, ' ')
		b = append(b, []byte(EscapeArg(v))...)
	}
	return string(b[1:]) // drop the leading space
}
// Simplified way with very crude preallocation, tailored for just this test:
// 2 bytes per argument (each space-separated arg is "x", thus unquoted),
// plus 100 extra
func makeCmdLine(args []string) string {
	if len(args) == 0 {
		return ""
	}
	b := make([]byte, 0, len(args)*2+100)
	for _, v := range args {
		b = append(b, ' ')
		b = append(b, []byte(EscapeArg(v))...)
	}
	return string(b[1:]) // drop the leading space
}
===============================
Simplified way with very crude preallocation:
0 arguments: 447ms
10 arguments: 453ms
50 arguments: 467ms
200 arguments: 471ms
1,000 arguments: 501ms
10,000 arguments: 849ms
16,000 arguments: 1,120ms (OS must have stolen some cycles here :)
0 arguments: 454ms
10 arguments: 455ms
50 arguments: 465ms
200 arguments: 464ms
1,000 arguments: 511ms
10,000 arguments: 809ms
16,000 arguments: 967ms
0 arguments: 457ms
10 arguments: 457ms
50 arguments: 457ms
200 arguments: 470ms
1,000 arguments: 502ms
10,000 arguments: 783ms
16,000 arguments: 969ms
Simplified way without preallocation.
0 arguments: 465ms
10 arguments: 458ms
50 arguments: 463ms
200 arguments: 470ms
1,000 arguments: 495ms
10,000 arguments: 796ms
16,000 arguments: 977ms
0 arguments: 459ms
10 arguments: 455ms
50 arguments: 455ms
200 arguments: 470ms
1,000 arguments: 499ms
10,000 arguments: 805ms
16,000 arguments: 975ms
0 arguments: 461ms
10 arguments: 450ms
50 arguments: 455ms
200 arguments: 471ms
1,000 arguments: 502ms
10,000 arguments: 807ms
16,000 arguments: 980ms
|
Thank you @bobjalex. With these numbers I guess we should put it to rest. (Unless you really want to run a profiler to see where the time is mostly spent.) |
This issue doesn't need any further work given the results above. |
I agree with putting it to bed -- the command-line generation is nice and fast now, or will be when it gets into a public release.
Just installed Go 1.15.3. To tweak my Go install to get external-process invocation performance "at least as good as Python", here's what I do:
- Remove the 5ms delay from os/exec_windows.go. I've been doing that for a long time now (a couple of years) and have never had a related problem (I have two less-than-5-year-old Windows Intel-based home computers).
- Patch our recent makeCmdLine speedup into syscall/exec_windows.go.
About that 5ms delay:
I have been assured in past conversations that without the delay, errors sometimes occur when deleting the executable immediately after return from the process wait.
The comment in the code says: "// NOTE(brainman): It seems that sometimes process is not dead when WaitForSingleObject returns..."
I suspect that a more accurate statement would be "the process is complete but the executable has not yet been closed".
Is it part of the contract of "wait" that the executable is closed on return? By far most external process launches don't care whether the executable is closed before completion is signalled. Closing the executable is part of the OS's cleanup after running a process and may be done concurrently after wait returns -- the caller should not have to wait for that.
For Windows, those few programs that run processes and then delete the executable (such as "go run"?) might have to deal with an executable that is still open. Note that Unix does not have this concern, since it's OK to remove an open file (the file is physically deleted as soon as it has no openers left).
Suggestion:
- Remove the unconditional 5ms delay.
- In the few programs that want to delete the executable after completion, the *client* should:
  - be sure to check the status of the deletion operation
  - if error: retry a few times, with a small delay between retries (assuming that the executable will soon be closed)
  - give up if success does not happen soon, announcing the failed deletion (if possible)
|
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
yes
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
Started a process with lots of arguments (thousands).
What did you expect to see?
Process runs.
What did you see instead?
Process runs but starts slowly -- much slower than the equivalent launch in Python. And I have the solution...
In investigating the said slowness, I gathered some timings, running on my fairly modern Windows 10 laptop:
The fix is a few lines in the syscall/exec_windows.go file, at line 86 in my 1.15.2 source:
Replace:
with:
I hope someone on the Go team can make this modification.
Here is the program I used for the benchmarks:
(you will have to provide your own "nop" command -- set the nopCommand constant)
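The benchmark program attached to the original issue is not preserved in this transcript. A hypothetical reconstruction of a harness in the same spirit (nopCommand is a placeholder as the original note says, and the repetition counts are guesses matching the "100 launches" figures reported above):

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// nopCommand is a placeholder: substitute your own do-nothing
// executable, per the note above.
const nopCommand = "cmd"

// timeLaunches runs nopCommand reps times with nargs single-byte
// arguments and returns the total elapsed time. Launch errors are
// ignored, since only the launch cost is being measured.
func timeLaunches(reps, nargs int) time.Duration {
	args := make([]string, nargs)
	for i := range args {
		args[i] = "x"
	}
	start := time.Now()
	for i := 0; i < reps; i++ {
		_ = exec.Command(nopCommand, args...).Run()
	}
	return time.Since(start)
}

func main() {
	for _, n := range []int{0, 10, 50, 200, 1000, 10000} {
		fmt.Printf("%d arguments: %v\n", n, timeLaunches(100, n))
	}
}
```
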