Proposal: Go2: Vector Basic Type - Similar to []T but with enhancements #35307
Comments
The long list of restrictions is pretty complicated and not very Go like. In Go we aim for orthogonality of language features. I particularly note "If a vec is set to nil, then all pointers (eg *float32vec) that point to it are set to nil and no longer associated with vec"; I don't see how that can be implemented. The usual problem with general purpose vector types is that different processors implement different kinds of support. That makes it hard to use general purpose types and be confident that they will work efficiently on all processors. So if you care about portability you wind up writing processor-specific code anyhow. And if you are going to write processor-specific code, it's not obvious why the language needs to support it. For example, we could instead provide processor-specific versions of the vector intrinsic functions used in C (e.g., https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/x86-Built-in-Functions.html#x86-Built-in-Functions). |
I am not a compiler developer, so those restrictions are more suggestions to keep numerical-model and common array pitfalls from occurring (what I frequently see in debugging). Even in Fortran you are supposed to nullify all pointers. I can see how the inferred pointer nil'ing would be a problem; I was hopeful it would improve the robustness of code, but requiring an explicit nil, or auto-nil'ing when pointers go out of scope, could work. For example, in Fortran:
In terms of implementation, it would have to be pushed onto the compiler to optimize the code for the architecture it is on. This is what is done in Fortran: you write the code, and the compiler translates it to the appropriate OS/CPU and optimizes the hell out of it. It's why code written in the 1980s in Fortran is still usable today and did not have the hangups that C did when machines went to 64-bit. In fact, some code I compile into my projects was written in the early 90s; one program that I can compile, but don't use, solves determinants and was written in the 1960s! The Fortran compiler knows that the arrays are contiguous in memory and imposes additional restrictions on the pointers, such that the binary is optimized for the target OS/CPU. For example, my current project is a regional surface and groundwater simulation software platform composed of ~300,000 lines of Fortran that compiles on Windows and Linux on Intel/AMD64 CPUs. It auto-vectorizes any array manipulations (the DO CONCURRENT, or similar to the examples I gave earlier). It was something I noticed that would extend the utility of Go by including some sort of vector support. The one thing I don't like about Python is that everyone just turns to C for real array operations (hence why numpy is faster than pure Python: it's really just a C wrapper on top of Fortran). Another option is to develop bindings for fgo, which allows Go to pass slices to Fortran subroutines. From what I have read of the recommendations about cgo, though, I would not advocate that. (Personally I would love it, but it sounds like the benefits are lost in translation.) |
Slices provide contiguous memory and pass by reference already. Have you looked at GoNum? It provides a lot of what you are describing as far as optimizations go. I think that just leaves the operator overloading. |
The benefit of this proposal is the operators. Go lacks the ability to define operators (not overload them!). GoNum is no help: using methods makes mathematical expressions unreadable, untestable, and in the end wrong, because methods have no precedence. |
So could this proposal just be operator overloading on slice types? Do we need a separate vector type?
|
It's more that creating a "vec" type also cues the Go compiler to make additional optimizations, at the sacrifice of flexibility. The compiler then can offer flags for additional CPU optimizations, such as SSE/AVX/etc. In Fortran there is automatic vectorization: when you compile, you specify the instructions it should try to use (or the minimum CPU it should support), and the compiler looks for contiguous blocks and vector math and emits the instructions for that CPU (such as AVX2) into the assembly. The previous Fortran code I set up, for example, would automatically be compiled to AVX2 instructions. |
We should certainly consider adding vectorization to the gc compiler. Note that you can already get vectorization with Go by using gccgo or GoLLVM. Adding new types to the language is a much higher bar. |
Does this only apply to the C code developed for Go, or to the Go code itself? I saw the LLVM option, and put it on my todo list to figure out their Fortran LLVM (mostly curious whether it can compete with the proprietary compilers, like Intel's). |
>Does this only apply to the C code developed for go or the go code itself?
The Go code.
https://godbolt.org/z/gRFcMy
|
@thanm I've tried to read the disassemblies, but sorry, my Z80 and 68K days are long gone. |
@maj-o the intent of my post was just to show that the gccgo backend is using the same sorts of vector instructions that you would expect to see for a comparable C/C++ example compiled with "gcc -O3". |
In practice, depending on auto-vectorization becomes a game of tug-of-war with the optimizer to generate the assembly one wants, across future changes to one's code. Most code that auto-vectorization takes care of is usually manually vectorized anyways, so I would prefer to see manual vectorization surfaced in the language/standard-library (possibly via intrinsics) as opposed to assembly. However, I think that dynamic vectors introduce loads of complexity to the optimizer as to what happens across architectures. Even with packed SIMD intrinsics, the fallbacks and API surface area get unwieldy, like with Vec128 (see https://godoc.org/github.com/smasher164/simd). I would be more open to first introducing fixed-width intrinsics. |
Note that an alternative that would solve many of these issues is with a more general approach of considering operator functions: #27605 It may be a good idea to look for a more generic language feature rather than having tons of built-in types that all have their own operators defined for them by the language. |
While operator overloading would be really nice, it's more about setting up a datatype whose limitations aid the compiler in determining optimization and auto-vectorization. I would not recommend making tons of vec versions, but rather five of them: one each for byte, int32, int64, float32, float64. Another option is to make them aliases of the slice versions of the same name, but with more restrictions imposed. Directives are nice to some degree, but I know the Go project is pretty anti-directive. While I think it would be better to impose a consistent protocol for the life of a variable, there could be some benefit for the programmer to say so explicitly. On a side note, this is Fortran operator overloading (a clipped version of my Fortran library that mimics Python's datetime; the bottom set of GENERIC pointers is the operator overloading):
|
This proposal has a long list of rules which I think are intended to force the use of vector instructions. But those rules are complicated, and seem hard to remember, and don't seem particularly Go like. It might be simpler to just permit some binary operations on slice and array types. That would make it easier for the compiler to vectorize those operations when possible. Though it would have other drawbacks, as the time required for the operation would depend on the length of the slice. If the desire is to be sure that operations use vector instructions, then I don't think this is the right approach. We need something simpler. |
I am open to suggestions; this was more to start a discussion thread on ways to make Go more suitable for numerical simulations without having to co-compile with C/Fortran (like what is done with Python to make it able to compete with Fortran). It would be great for the language to have some level of support for fast vector math, or fast looping through a vector or multi-dimensional array. I was not sure if the slice operations could be modified. There could be something added to the make() function for slices that indicates that the slice should be treated as a vectorizable array. Another option is to have make() use the capacity as a minimum and pick the optimal size for vectorization, or for the type of vectorization requested. So something like
would use the best requested instruction. A better option is just an on/off flag that must be used in tandem with a compiler flag (here I just use the word fast to indicate the array should use fast semantics).
Then when compiled it would by default ignore the "fast", but something like go -fast:avx myCode.go would inform the compiler to use AVX for any array tagged with fast (it also may increase the size of cap to make the operations better). If it's necessary to disable the optimization, loops could have a macro directive as I mentioned earlier in
but could be something more like: |
The question for me, although it might not belong to this issue, is whether it makes sense to add something like a vector package to the standard library. That's not going to take advantage of every possible SIMD optimization, but in my opinion anything more complex than that belongs in a package like gonum. And if that's not possible or good enough, then I'm happy doing numerical programming in julia/python/something else and using Go for what I'm using it for now. As a side note, I'm not a big fan of operator overloading in general, more so if there's no way of making sure certain properties hold. I've seen it go wrong even in very simple cases, like the overloading of operators for atomic values in C++, where it makes it much harder to detect at a glance when an operation is guaranteed to be atomic and when it isn't, compared to using explicit method calls. |
@Scott-Boyce Adding options to @IanTayler 's suggestion of a vector package is something that could more easily be done. The vector package would provide a comprehensible API, and anybody who choose to use it could do so. The language would not change. |
@ianlancetaylor That works for me as well. Most people don't use the directives in Fortran, but rather rely on compiler flags for the compiler target to do the optimizations (it would be my preference to have it that way; I just was unsure if the Go compiler required some sort of flag scheme in the code to help it). |
Regarding a standard vectorization package, it is worth looking at prior art in the form of the WebAssembly SIMD Proposal, which itself is inspired by Dart's SIMD Numeric Types. They both acknowledge that a 128-bit lane width is enough for most people. However, a standard vectorization package isn't feasible either without some form of generics or assistance from the language. For example, instead of defining a separate function for every lane configuration, a generic simd package could look like:

```go
package simd

contract V128(V, T) {
	V [4]float32, [2]float64, [16]int8, [8]int16, [4]int32, [2]int64, [16]uint8, [8]uint16, [4]uint32, [2]uint64
	T int8, int16, int32, int64, uint8, uint16, uint32, uint64, float32, float64
}

func ExtractLane(type V, T V128)(vec V, i int) T { ... }
```

Usage would look like:

```go
vec := [4]float32{1, 2, 3, 4}
simd.ExtractLane([4]float32, int16)(vec, 2)
```

Aside: it would be nice to be able to omit the type arguments, since they could be deduced from vec, but I don't think the draft allows that.

These generic functions would define scalar fallbacks in the package, but would need the compiler's assistance to be intrinsified (like math/bits) into SSE, NEON, MSA, Altivec, VIS across different GOARCHs. Runtime feature detection would likely slow down these operations unless the instructions can be scheduled in batches. Alternatively, we could introduce an environment variable to disregard feature detection at build time. Perhaps this issue can be repurposed into a proposal for a vector package? |
I recommend that we use a different issue for a different vector proposal. Based on the discussion above, and the complexity of the proposal, this particular proposal is a likely decline. Leaving open for four weeks for final comments. |
I think it is worth thinking about a solid and simple base for numerical types in Go (without breaking anything). Don't get me wrong, I love Go. |
There's an incredible opportunity for Go in this space. The current go-to numerical languages (python/r/julia) are all flawed in ways that make them painful to write. Bringing all the things Go has done right into numerical computing would have a big impact. I agree that operator overloading can go very wrong, after writing Scala for a couple of years, which means the operators probably need to be on a native type, as proposed with the vector package. Having pragmas that can support different hardware accelerators would be incredible. |
This is a strong and evidently empty statement. |
We agree that there is an opportunity here, but we don't know what it looks like. Also, this is an area that is likely to be affected by generics, which are in progress. For this specific proposal, there is no change in consensus. Closing. |
Background
Forgive me if I don't do this right; it's actually my first post on GitHub, let alone a feature development proposal. I primarily develop in Fortran 95-2008, with Python for pre/post-processing of data and glue code. I have been experimenting with Go for the past three months and think the language could have some powerful extensions beyond server-side programming.
Proposal
Add a "vec" basic type, analogous to standard slices, but optimized for vector math, contiguous memory, and pass-by-reference, and compiled to CPU SIMD instructions or other additional optimizations. In addition, there could be compiler flags to customize instructions for specific OSes or CPUs (e.g. AVX2), which would make things easier on the developer without having to dive into assembly code.
Potential type names could be
The use of the vec types would necessitate using make to initialize the base vector, which would point to an underlying array guaranteed to be in contiguous memory; once made, the vec can only point to that underlying array or to nil (nothing else). Operations that break the contiguity or the constant dimension, such as append, should be disabled.
A vector's size is set at run time, but cannot be dynamically changed once established.
For example:
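A sketch of such declarations in the proposed (hypothetical) syntax, using `float64vec` and `int64vec` as the type names:

```
x := make(float64vec, 100)
y := make(float64vec, 100)
ivec := make(int64vec, 100)
jvec := make(int64vec, 100)
```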
This would create x and y as float64 vector-arrays, and ivec and jvec as int64 vector-arrays. If possible, it might be good to make implicit padding part of the basic type (such as rounding the 100-element dimension up to the best cache size, while still limiting the visible length to 100).
Like a slice, a vec would point to an underlying array, but there would be additional restrictions:

- A vec that has not been allocated with `make()` is of nil type and is not usable (like a `map`).
- No dimension changing: after creation with `make()`, a vec cannot change its len/cap.
- The underlying array is always in contiguous memory.
- A vec is passed by reference; loops do not create copy variables (see the later example), only pointer references.
- The life of a vec variable: it starts as nil, and once allocated with `make()` it cannot be associated with any other memory location unless set to nil first. A vec can be passed to `make()` again to change its size, but it should then be thought of as a new variable (this makes name reuse simpler).
- If a vec is set to nil, then all pointers (e.g. `*float32vec`) that point to it are set to nil and are no longer associated with the vec.
- Pointers can only point to contiguous portions of allocated vectors.
- Pointer versions cannot be allocated with `make()`.
- Pointers may only point to `nil` or to an allocated vec; if the vec is allocated with `make(vec, len)`, then the pointer will point to the newly allocated memory along with the vec.
- If `p1 = &p2` and then later `p2 => &vec`, then by association `p1 => &vec`.
- Vec pointers are automatically dereferenced when a dimension is specified: `p1[3] = 5` is syntactic sugar for `*p1[3] = 5`, and `*p1 == p1[:]`. There are no dimension functions, just like a pointer, which requires an address.
- Pointers must point to an existing dimension or panic (the examples here assume `p1, p2 *vecint32; vec vecint32`):
  - `p1 = &p2` is ok if `p2 = nil` or `p2 => &vec`.
  - `p1 = &p2[2:6]` is ok only if `p2 != nil`, `p2 = &vec`, and the span covers at least 5 elements: `vec := make(vecint32, 10); p2 => &vec; p1 => &p2[2:6]`

```
vec := make(vecint32, 20) // vec contains 20 elements
p2 => &vec[2:]            // p2 points to vec[2]
p1 => &p2[4:8]            // p1 points to p2[4], which points to vec[6]
```
Vector Operations
The advantage of a vector type is to allow vector operations (element-wise operations and setting values).
For example:
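A sketch of element-wise operations in the proposed (hypothetical) syntax:

```
x := make(float64vec, 100)
y := make(float64vec, 100)
z := make(float64vec, 100)
z = x * y    // element-wise: z[i] = x[i] * y[i] for every i
z = z - y    // element-wise subtraction
```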
Scalar operations that use the same base type are applied to the entire vector: if a vector is combined with a scalar of the same type, the scalar is applied element-wise to the entire vector.
For example:
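A sketch in the proposed (hypothetical) syntax:

```
x := make(float64vec, 100)
x = 1.0      // every element of x set to 1.0
x = 2.0 * x  // every element doubled
```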
The advantage is that these operations may lend to faster vector processes by the compiler.
Looping and Reference Values
Looping with range should be syntactic sugar for referencing an index.
For example:
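A sketch of the intended (hypothetical) semantics, where the loop variable is a reference into the vec rather than a copy:

```
x := make(float64vec, 100)
for i, v := range x {
	v = float64(i)  // sugar for x[i] = float64(i); v references x[i], no copy is made
}
```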
Again, the main goal of this is to take advantage of looping over contiguous memory for long vectors of numbers.
Possible multi-dimension extensions
While I would not advocate this, it does open up the possibility of creating an alias to a vec type that creates pointers into a vector so it can be referenced by multiple dimensions.
I am not sure how this would be done, but one possibility is to set one of the dimensions in the way an array is declared.
For example:
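One possible (hypothetical) spelling, fixing the inner dimension as in an array declaration:

```
x := make([8]float64vec, 32)  // 32 contiguous float64s, referenced in groups of 8
x[1, 3] = 7.0                 // the multi-index maps onto a single flat index in x
```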
This would create a vector of 32 float64s in contiguous memory that can be referenced in groupings of 8. For instance, `x[1,3]` would be syntactic sugar for `x[1*8 + 3 - 1]`, or simply `x[10]`.
Under the hood it would just be a contiguous-memory vector, but the multi-index would open the doors to many numerical applications and potential compiler optimizations.
Hopefully this is a useful suggestion that would open the door to Go applications in numerics, as well as adding faster vector math for Go's current applications.