proposal: Go 2: Multi-dimensional slices #41129

RaananHadar · 2020-08-29T18:27:44Z

This proposal was originally denied back in 2014, not because it was unjustified, but because the go team said that go1 was, justifiably, stable, with the argument that consideration of this should be deferred to go2.

Its 2020 and data science, machine learning and AI are more important today than ever.
There are some really compelling arguments for why go can be the next disruptive language for data science.

I would like to thank @btracey and the gonum team for their originally written proposal.

Please do reconsider.

seankhliao · 2020-08-29T20:23:56Z

Unless the concerns in #6282 are addressed, the mere passage of time should not change the decision that was made before

RaananHadar · 2020-08-29T21:06:29Z

#6282 is a long discussion so it deserves a recap.

There are no critical standing concerns to be addressed, the go team acknowledged that this is a valid point and though the proposal is not 100% perfect, it is justified and should be brought up to more mature discussion when people will start discussing go2 more seriously.

Here is the original decision, to quote @griesemer :

We suggest that this discussion continue with the intent to come up with a new and improved design (possibly along the lines of one of the alternatives above) with the intent of having a blueprint for consideration for Go 2.

This issue fell between the cracks and was frozen due to age. It should be revived.

ianlancetaylor · 2020-08-29T21:57:51Z

But someone still needs to come up with a new and improved design. Opening this issue doesn't help without that.

RaananHadar · 2020-08-30T07:55:25Z

This is a difficult feature. Somewhat analogous to generics, a good design is not something that appeared over night. It required significant engineering efforts, including involvement also from the core go dev team. And a since a good design is going to require a community effort, this will not be solved unless a problem is categorized, discussed and tracked, which it was not.

I am arguing that this is a valid problem that was put aside because it was a tough problem that came in a time where the go language was not open to changes. Though I have the capability to present a design, I think this would be presumptuous and I believe that bringing everyone to discuss this is the important first step.

It is true that I should involve the gonum team in this. I am hopeful.

ianlancetaylor · 2020-08-31T18:02:07Z

I agree with what you say. My point is a different one. This isn't how we use the issue tracker, and, in general, this kind of discussion does not work on the issue tracker. The problem is that the issue tracker is not threaded, and the GitHub interface starts hiding comments when there are too many of them. A better place for this kind of discussion is a mailing list.

So by all means have the discussion. Just don't do it here. Thanks.

RaananHadar · 2020-08-31T20:35:35Z

@ianlancetaylor I completely agree with you on this one. I think the original issue staled because it was too big of a discussion. My plan for now is to make an Experience Report which was not submitted for the original issue, because experience reports did not exist at the time of original proposal. And move with the community to a refined proposal.

EDIT: having gone more deeper into all of the prior proposals, I just noticed now that @ianlancetaylor was behind #13253, I apologize for not noticing that and crediting your work in the original post. I was not fully aware of the alternative proposal at the time.

griesemer · 2020-09-01T23:36:24Z

For the sake of discussion, let's assume Go had support for multi-dimensional slices. Such slices would need to support a certain set of operations, such as creation/allocation of those slides, the ability to index elements, possibly iteration over the elements, maybe element-wise operations (adding two matrices), etc.

Also for the sake of discussion, I am going to make a few claims:

Multi-dimensional slices are nice for small problems but don't scale. Real problems require detailed control over the layout of matrices: There is a reason why numerical libraries such as LINPACK, LAPACK, etc. but also gonum support various representations of matrices. There are dense and sparse matrices. There may be distributed matrices. The representations of these matrices can be very different from a basic multi-dimensional slice - at best they may use multi-dimensional slices underneath.
AI applications, while operating on matrices, depend on maximum throughput and (typically) specialized hardware. To achieve the throughput or drive the hardware, the data must be brought into a suitable form which may be different from the "built-in" representation of matrices/multi-dimensional slices of a programming language. Again, real applications require detailed control over the layout of such data structures.

In summary, what we need is full programability of (memory) layout. And with programmability of layout we also need programmability of operations, as they depend on the layout.

Go already provides support for that: it's called user-defined data types. With generics, we can also parameterize these data types (vectors, dense and sparse matrices, etc.) by element type. Using methods, we also can implement the necessary operations.

So if anything is missing it's perhaps convenience of notation, i.e., syntax.

Operator methods are not sufficient. Simply permitting operator "names" such as +, *, etc. as method names (as in func (x *Vector) + (y *Vector) *Vector, so that we can write x + y which would mean x.+(y)) is a fairly easy language change but unfortunately it also doesn't scale. First, we don't just need operator methods, we probably also want operator method overloading: given a matrix m and a scalar value x, m + m would invoke the + method with a matrix receiver and argument, while m + x would have to invoke a method with a matrix receiver and a scalar argument. That is, we need two + methods (same name, different signatures), or perhaps a generic + method with method-specific type parameters. Operator overloading is not something we would be willing to add to Go (it would cause serious damage to the simplicity of method lookups); and generic methods are not sufficiently understood to be considered (see the generics draft design). One could make the 2nd operator parameter an empty interface and use a type switch inside the operator method, but that seems also not very satisfactory.
But it gets worse: for performance, we would not write (for example) a matrix multiplication as B = A*x (translated to B = A.*(x)) because it would require a (possibly large) allocation for the result B. For performance, we would want to pass in the result memory as an argument (perhaps the receiver) to the operations. That is what math/big and the gonum package do. In short, unless there's yet another mechanism to somehow control how results are allocated, operator methods fall flat in real applications.

To make progress here we need to have strong arguments as to why these claims are wrong.

There's one thing we could do on the syntax side: we could perhaps support indexing methods. If we declare methods such as [] and []=

type Matrix[T Numeric] struct ...
func (m *Matrix[T]) [] (i, j int) T ... // getter: x = m[i, j] translates to x = m.[](i, j)
func (m *Matrix[T]) []= (i, j int, x T) ... // setter: m[i, j] = x translates to m.[]=(i, j, x)

we could implement multi-dimensional data structures in libraries yet provide notationally convenient access to the elements. Such methods would only accept a prescribed signature (n integer arguments and a result or argument of a given element type). (See also https://www.dotconferences.com/2016/10/robert-griesemer-prototype-your-design.)

ianlancetaylor · 2020-09-15T23:02:46Z

Based on the discussion above, this proposal is a likely decline. Discussing the topic is fine but there is no detailed proposal here. Leaving open for four weeks for final comments.

RaananHadar · 2020-09-23T08:18:56Z

Thank you for giving me the time to do the required homework on this. The post by @griesemer and your past works gave me extremely valuable food for thought. I have written and published an experience report on this issue and linked it to the official experience reports page. I could not have done this work without all the collective past contributions of the Go community and the core Go dev team.

I became convinced, only after the completing it, that the solution proposed by @griesemer:

Solves the right problem.
Solves it properly.

I am proposing that you consider index operator methods as proposed by @griesemer as the right solution to this problem and I do hope that you reach the same conclusion after you read my experience report.

Thanks again for this opportunity.

changkun · 2020-09-24T09:50:19Z

A 15 Years of array programming experience report (and what makes python success today):

Harris, C.R., Millman, K.J., van der Walt, S.J. et al. Array programming with NumPy. Nature 585, 357–362 (2020). https://doi.org/10.1038/s41586-020-2649-2

chewxy · 2020-09-28T11:22:55Z

Hello, author of gorgonia.org/tensor here. Hope you don't mind me chiming in.

The tensor package has a very numpy-ish API, and scales indefinitely with number of dimensions (i.e. you can have a 4/5/6/...128 dimensional array.

I think @griesemer merely scratches the surface of the issues with making multidimensional slices. I also think there may be some benefits to additional accessor syntax. Just not in the way most would imagine it.

Here's an example:

Assume T is a 3 dimensional array with shape (5, 10, 10). With the tensor package, you can write something like this to slice the array:

T.Slice(S(0,3), nil, S(5,9))

In Python/Numpy, it'd look like this:

T[0:3, :, 5:9]

The latter is very much cleaner than the former.

Ideally with the [] and []= syntax extensions, they should be parameterized over a given tensor rank (or dimension, if you are coming from the numpy world).

Just having it as a plain accessor (as suggested) isn't going to do much though. You could currently do something like this:

T.SetAt(value, i,j,k)
x, err := T.At(i,j,k)

Going from that to this:

T[i][j][k] = value
x := T[i][j][k]

brings little utility. When programming with arrays, you seldom set values like this. Instead you typically set values in batches. So, using the same T as an example, you need to be able to syntactically support doing something like this:

T[i] = ... // multidimensional value here

This is doable with some slight modification to the extension that @griesemer proposed, e.g.

type Dense[T any] struct ...
func (d *Dense[T]) [] (coords ...type U) ... // getter or slicer
func (d *Dense[T]) []= (val T, coords ...type U) ... // setter

where U is defined

type U interface{
    int, Slice
}

Numpy's killer feature is the ability to slice with syntax and I think Go should have them too.

Of course, the problem is if you allow this, then abuses of syntax extensions will be abound. So we gotta balance that as well.

RaananHadar · 2020-09-28T11:44:25Z

Thanks @chewxy. I completely agree with you. I apologize for not being clear on this.
My experience report stresses that the index operators require parametrized arguments for many common use-cases to really bring significant benefit to the user.

Whether this is:

type Dense[T any] struct ...
func (d *Dense[T]) [] (i, j int)

or

func (d *Dense[T]) [] (coords ...type U)

Doesn't change much on my part. As long as we are adding [] and []= with a declared signature, the rest is all functionality that already exists today in Go and can be implemented by the user.

As for syntax abuse, you can abuse interface{} with function calls today. I think Go programmers have learned not to abuse it and I don't think its any different.

RaananHadar · 2020-10-02T02:04:27Z

You can find a proposal which is based on the proposal template here.

Thanks again for your time.

RaananHadar · 2020-10-03T18:25:32Z

Please note, I have made some significant additions to the proposal to better address the ability to add custom slicing operations. This better addresses my experience report and now fully supports the use-case detailed by @chewxy .

EDIT: For a discussion of the proposal please follow this link.

ianlancetaylor · 2020-10-13T23:26:41Z

There was further discussion, but no change in consensus on this particular proposal.

gopherbot added this to the Proposal milestone Aug 29, 2020

gopherbot added the Proposal label Aug 29, 2020

RaananHadar changed the title ~~proposal: Go 2 : Multi-dimensional slices~~ proposal: Go 2: Multi-dimensional slices Aug 29, 2020

ianlancetaylor added v2 A language change or incompatible library change LanguageChange labels Aug 29, 2020

griesemer mentioned this issue Sep 1, 2020

proposal: Go 2 review meeting minutes #33892

Open

ianlancetaylor added the Proposal-FinalCommentPeriod label Sep 15, 2020

ianlancetaylor closed this as completed Oct 13, 2020

golang locked and limited conversation to collaborators Oct 13, 2021

gopherbot added the FrozenDueToAge label Oct 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

proposal: Go 2: Multi-dimensional slices #41129

proposal: Go 2: Multi-dimensional slices #41129

RaananHadar commented Aug 29, 2020 •

edited

seankhliao commented Aug 29, 2020

RaananHadar commented Aug 29, 2020 •

edited

ianlancetaylor commented Aug 29, 2020

RaananHadar commented Aug 30, 2020 •

edited

ianlancetaylor commented Aug 31, 2020

RaananHadar commented Aug 31, 2020 •

edited

griesemer commented Sep 1, 2020

ianlancetaylor commented Sep 15, 2020

RaananHadar commented Sep 23, 2020 •

edited

changkun commented Sep 24, 2020 •

edited

chewxy commented Sep 28, 2020

RaananHadar commented Sep 28, 2020 •

edited

RaananHadar commented Oct 2, 2020 •

edited

RaananHadar commented Oct 3, 2020 •

edited

ianlancetaylor commented Oct 13, 2020

proposal: Go 2: Multi-dimensional slices #41129

proposal: Go 2: Multi-dimensional slices #41129

Comments

RaananHadar commented Aug 29, 2020 • edited

seankhliao commented Aug 29, 2020

RaananHadar commented Aug 29, 2020 • edited

ianlancetaylor commented Aug 29, 2020

RaananHadar commented Aug 30, 2020 • edited

ianlancetaylor commented Aug 31, 2020

RaananHadar commented Aug 31, 2020 • edited

griesemer commented Sep 1, 2020

ianlancetaylor commented Sep 15, 2020

RaananHadar commented Sep 23, 2020 • edited

changkun commented Sep 24, 2020 • edited

chewxy commented Sep 28, 2020

RaananHadar commented Sep 28, 2020 • edited

RaananHadar commented Oct 2, 2020 • edited

RaananHadar commented Oct 3, 2020 • edited

ianlancetaylor commented Oct 13, 2020

RaananHadar commented Aug 29, 2020 •

edited

RaananHadar commented Aug 29, 2020 •

edited

RaananHadar commented Aug 30, 2020 •

edited

RaananHadar commented Aug 31, 2020 •

edited

RaananHadar commented Sep 23, 2020 •

edited

changkun commented Sep 24, 2020 •

edited

RaananHadar commented Sep 28, 2020 •

edited

RaananHadar commented Oct 2, 2020 •

edited

RaananHadar commented Oct 3, 2020 •

edited