proposal: x/exp/xiter: new package with iterator adapters #61898
Comments
The duplication of each function is the first thing that catches the eye. Are there thoughts on why this is acceptable? |
What about an adapter that converts an |
Some typos: EqualFunc2, Map2, Merge2, and MergeFunc2 lack the 2 suffixes on their actual names. They're all correct in the corresponding documentation. |
May I humbly suggest that the name "iterutils" is less susceptible to, uh, unfortunate mispronunciation. |
For |
I'd actually prefer Edit: I just realized that if |
This proposal has been added to the active column of the proposals project |
The more I think about it, the more that I think that API design for this should wait until after a decision is made on #49085. Multiple other languages have proven over and over that a left-to-right chained syntax is vastly superior ergonomically to simple top-level functions for iterators. For example, compare nonNegative := xiter.Filter(
xiter.Map(
bufio.Lines(r),
parseLine,
),
func(v int) bool { return v >= 0 },
) vs. nonNegative := bufio.Lines(r).
Map(parseLine).
Filter(func(v int) bool { return v >= 0 }) Go's a little weird because of the need to put the lines := bufio.Lines(r)
intlines := xiter.Map(lines, parseLine)
nonNegative := xiter.Filter(intlines, func(v int) bool { return v >= 0 }) That works, but it clutters up the local namespace and it's significantly harder to edit. For example, if you decide you need to add a new step in the chain, you have to make sure that all of the variables for each iterator match up in the previous and succeeding calls. |
What type does |
You would probably have to wrap the base iterator like:
|
Sorry. I should have stuck a comment in. I was just coming up with some hypothetical function that would give an Not necessarily. The transformative and sink functions on iterators could just be defined as methods on |
I was wrong, it’s not an interface. |
Why do some functions take the names := xiter.Map(func (p Person) string {
return p.Name
}, people) // "people" gets lost
// vs
names := xiter.Map(people, func (p Person) string {
return p.Name
}) |
@DeedleFake There won't be a "decision" on #49085 anytime soon. There are good reasons not to do it yet, but we also don't want to say it never happens. The issue exists to reflect that state. What it comes down to is, would you rather have no iterators (for the foreseeable future) or ones which can't be "chained"? |
No iterators, definitely. I've done fine without them for over a decade. I can wait a bit longer. If a bad implementation goes in, I'll never get a good version. Plus, I can just write my own implementation of whatever iterator functions I need as long as |
Neither chaining nor functional programming has ever been a decisive or recommended technique in Go. Instead, iteration—specifically, procedural 'for' loops—has always been a core technique since the language's inception. The iterator proposals aim to enhance this core approach. While I don't know what the overall plans are, if you're hoping for Go to follow the path of Java Streams or C# LINQ, you might be in for disappointment. |
I think "a bit" is misleading. We are talking years - if at all. And I don't believe the second part of that sentence is true either, we could always release a v2 of the relevant packages, if we ever manage to do #49085 in a decade or so. |
Is that not the intention of these proposals? To build a standardized iterator system that works similarly to those? Why else is there a proposal here for
Edit: The way this proposal is phrased does actually imply that they may be heavily reevaluated enough in That issue has only been open for 2 years. I think assuming that it'll take a decade to solve is a bit unfair. Yes, a One of my favorite things about Go is how slow and methodical it (usually) is in introducing new features. I think that the fact that it took over a decade to add generics is a good thing, and I really wanted generics. One of the purposes of that approach is to try to avoid having to fix it later. Adding those functions in the proposed manner will almost definitely necessitate that later fix, and I very much would like to avoid that if at all possible. |
Java Streams and .NET LINQ build on a standardized iterator system, but they are more than that. Both languages had a generic iterator system before. Iterators are useful without chaining or functional programming.
That would be this very proposal, and it comes with a caveat: "... or perhaps not. There are concerns about how these would affect idiomatic Go code. " This means that not everyone who has read these proposals in advance believes that this part is a good idea. |
Maybe chaining leads to too much of a good thing. It becomes more tempting to write long, hard-to-read chains of functions. You're less likely to do that if you have to nest calls. As an analogy, Go has |
Re #49085, generic methods either require (A) dynamic code generation or (B) terrible speed or (C) hiding those methods from dynamic interface checks or (D) not doing them at all. We have chosen option (D). The issue remains open like so many suggestions people have made, but I don't see a realistic path forward where we choose A, B, or C, nor do I see a fifth option. So it makes sense to assume generic methods are not going to happen and do our work accordingly. |
@DeedleFake The issue is not lack of understanding what a lack of parameterized methods means. It's just that, as @rsc said, wanting them doesn't make them feasible. The issue only being 2 years old is deceptive. The underlying problem is actually as old as Go and one of the main reasons we didn't have generics for most of that. Which you should consider, when you say
We got generics by committing to keep implementation strategies open, thus avoiding the generics dilemma. Not having parametric methods is a pretty direct consequence of that decision. |
Well, I tried. If that's the decision then that's the decision. I'm disappointed, but I guess I'll just be satisfied with what I do like about the current proposal, even if it has, in my opinion, some fairly major problems. Sorry for dragging this a bit off-topic there. |
Hope that it's not noise: I wondered if naming it the |
Those nonstandard Zip definitions look like they would occasionally be useful, but I think I'd want the ordinary zip/zipLongest definitions most of the time. Those can be recovered from the proposed ones with some postprocessing, but I'd hate to have to always do that.
These should be considered along with Limit:
LimitFunc - stop iterating after a predicate matches (often called TakeWhile in other languages)
Skip, SkipFunc - drop the first n items (or until the predicate matches) before yielding (opposite of Limit/LimitFunc, often called drop/dropWhile) |
Can you explain the difference? Is it just that |
zip stops after the shorter sequence. zipLongest pads out the missing values of the shorter sequence with a specified value. The provided ones are more general and can be used to build those, but I can't really think of any time I've used zip where I needed to know that. I've always either known the lengths were equal by construction, so it didn't matter, or couldn't do anything other than drop the excess, so it didn't matter. Maybe that's peculiar to me and the situations in which I reach for zip, but they've been defined like that in every language I can think of that I've used, which has to be some kind of indicator that I'm not alone in this. I'm not arguing for them to be replaced with the less general, more common versions: I want those versions here too so I can use them directly without having to write a shim to the standard definition. |
I think it would look like this: func ParseIntegers(seq iter.Seq[string], errp *error) iter.Seq[int] {
return func(yield func(int) bool) {
for s := range seq {
n, err := strconv.Atoi(s)
if err != nil {
*errp = err
return
}
if !yield(n) {
return
}
}
}
} |
Yes, that works. I wouldn't want to write or use that. I also don't think it is wholly unnatural to continue processing, even if a single datum is broken - I think it makes sense to skip/log broken data and continue with the rest. Like, imagine a batch-processing pipeline, taking a sequence of filenames, doing some processing on each, and mapping it to an aggregate and error. But sure, if the consensus is that we just don't think people should do that kind of stuff and should always just stop iterating if any error is encountered, we wouldn't need |
@Merovius In addition to your options, I would propose another one. If we wanted to convert a function that operates on a value: func(v T) (R, error) into a function that operates on a sequence: func(s iter.Seq[T]) iter.Seq2[R, error] we could define a function as type Func[T, R any] func(v T) (R, error)
type SeqFunc[T, R any] func(s iter.Seq[T]) iter.Seq2[R, error]
func AsSeq[T, R any](fn Func[T, R]) SeqFunc[T, R] {
return func(s iter.Seq[T]) iter.Seq2[R, error] {
return func(yield func(R, error) bool) {
for v := range s {
v2, err := fn(v)
if !yield(v2, err) {
return
}
}
}
}
} With this function, func ParseIntegers(s iter.Seq[string]) iter.Seq2[int, error] {
return AsSeq(strconv.Atoi)(s)
} or even ParseIntegers := AsSeq(strconv.Atoi) |
Map12 seems entirely reasonable and generally useful. So does Map21, for that matter (Keys and Values can be implemented in terms of it). Add // First stops iteration and sets err on the first err in s
func First[K any](err *error, s iter.Seq2[K, error]) iter.Seq[K] And you can do var err error
for s := range xiter.First(&err, xiter.Map12(seq, strconv.Atoi)) { |
If we want to log errors, then my first inclination would be to change the function we are passing in. Instead of passing func(s string) int {
r, err := strconv.Atoi(s)
if err != nil {
log.Printf("bad number %s: %v", s, err)
}
return r
} If we think that will happen with some frequency then func LogErrors[T1, T2 any](f func(T1) (T2, error)) func(T1) T2 {
return func(v T1) T2 {
r, err := f(v)
if err != nil {
log.Printf("failure on %v: %v", v, err)
}
return r
}
} Or, of course, func LogAndSkipErrors[T1, T2 any](it iter.Seq[T1], f func(T1) (T2, error)) iter.Seq[T2] {
return func(yield func(T2) bool) {
for v := range it {
r, err := f(v)
if err != nil {
log.Printf("failure on %v: %v", v, err)
continue
}
if !yield(r) {
return
}
}
}
} My general point is that we don't have to focus on "a sequence of values with errors". We can focus on "use a function to convert one value to another, while handling errors in some way". |
FWIW none of these other options really changes my point about this being awkward and requiring boilerplate. Anyways, I just thought I should bring it up. To me, the case of having a mapping-function that returns an But if the consensus is that this is not needed, okay. |
I'm not opposed to mapping functions that work with errors, but I think it would be premature to add them today. |
Thanks for the provocative example. The combination of
Both possible values of the error term I'm optimistic about ways to dress up
If I can provide a fallible sequence with an error handling function, I can isolate some expected errors and fail otherwise. |
I think it may be worth looking at this from a completely different angle: jq is a DSL where iterators are first-class citizens, with syntax designed to be ideal for iterator chaining. Go already has several other DSLs in the standard library ( A Go implementation can be found at https://github.com/itchyny/gojq . Its APIs are focused on |
Another reason not to name this package xiter is that since this was proposed, “xitter” has become a common nickname for Elon Musk’s social media site. |
In English 'x' is pronounced /ks/, not /ʃ/. I think there would have to be intent behind mispronouncing it. Also, 'liter' and 'litter' are two different words, and nobody would shudder at the thought of drinking a liter of water just because it's phonetically similar to 'litter'. |
An initial X is not common in English, but when it does occur, it is not pronounced as a KS sound like medial X. Xerox and xylophone are both Z sounds, which leads to the natural pronunciation "zitter". You also see Xi commonly called "Zee" by people who don't know Pinyin. |
Okay, but a lot has to go wrong to accidentally change a voiced lenis alveolar fricative like /z/ into a voiceless fortis postalveolar fricative like /ʃ/. The more probable mispronunciation is /s/, leading to the word "sitter," which doesn't carry a negative connotation. |
Why deal with any of these problems? Just get a name that can't be mispronounced. iterx is fine, for example. |
I have a better idea: let's just not add this "feature". Go promises simplicity, and new abstractions break that promise. With all respect, this should be the main point of discussion, not naming. |
Do you mean functions as iterators or this package? I think functions as iterators are a big change to the language but mostly a positive one. It's much too complex to make or use an iterator now. As for this package, I think it probably goes too far in the direction of library calls for what could just be a series of statements, but people are going to demand it, so we might as well have a single semi-canonical version of it in golang.org/x. |
To add to this, Wikipedia notes that jq is a pure functional language. When considering that jq's intended purpose is to query JSON, one comes to the realization that all query languages are pure functional languages (although obviously only when querying and not when updating: SQL It may just be that query languages are the ideal form of the functional programming paradigm: DSLs that are designed around easy composition of iterators, where side-effects don't exist due to being database read operations. In that case, studying other query languages and designing a custom one tailored for Go's type system should be more fruitful than trying to shoehorn functional programming constructs directly into Go, of which the many difficulties are already discussed above. It would also allow for far more advanced iterator composition techniques than can be accomplished natively in Go. For prior art on embedded query languages, see C#'s LINQ, which allows writing SQL-like expressions that, among other things, can be used to run |
Maybe some simple compiler magic is all we need. Usually we write var buf *bytes.Buffer
buf.String() but we can also write var buf *bytes.Buffer
(*bytes.Buffer).String(buf) So why not this, a syntax similar to a template pipeline: var buf *bytes.Buffer
buf \ (*bytes.Buffer).String() Then we would have nonNegative := bufio.Lines(r) \
xiter.Map(parseLine) \
xiter.Filter(func(v int) bool { return v >= 0 }) |
Would it be useful to do something like:
with |
Throwing in a particular shout for some form of Tee, because the simple naïve implementation is not as performant as a trickier implementation. As such it would be valuable to get it right, once, in the stdlib and optimise it there. (I've mentioned this before somewhere, but can't remember/find which discussion) |
A while ago, I wrote a nonsensical signature for a function that produced an iterator with possible errors:
I now know what I meant to write:
See this comment, where I argue that returning an error function along with an |
We propose to add a new package golang.org/x/exp/xiter that defines adapters on iterators. Perhaps these would one day be moved to the iter package or perhaps not. There are concerns about how these would affect idiomatic Go code. It seems worth defining them in x/exp to help that discussion along, and then we can decide whether they move anywhere else when we have more experience with them.
The package is called xiter to avoid a collision with the standard library iter (see proposal #61897). An alternative would be to have xiter define wrappers and type aliases for all the functions and types in the standard iter package, but the type aliases would depend on #46477, which is not yet implemented.
This is one of a collection of proposals updating the standard library for the new 'range over function' feature (#61405). It would only be accepted if that proposal is accepted. See #61897 for a list of related proposals.
/*
Package xiter implements basic adapters for composing iterator sequences:
*/
package xiter
Ideally we would define these:
but we may not be able to. If not, the definitions below would refer to iter.Seq and iter.Seq2 instead of Seq and Seq2.