Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: slices: add Shift #64103

Open
pgaskin opened this issue Nov 13, 2023 · 14 comments
Open

proposal: slices: add Shift #64103

pgaskin opened this issue Nov 13, 2023 · 14 comments
Labels
Milestone

Comments

@pgaskin
Copy link

pgaskin commented Nov 13, 2023

I propose adding the following function to slices:

// Shift removes the first element from s, returning false if the slice is empty.
func Shift[S ~[]E, E any](s S) (first E, rest S, ok bool) {
	if len(s) == 0 {
		rest = s
	} else {
		first, rest, ok = s[0], s[1:], true
	}
	return
}

Like #64095, this is useful for parsing stuff.


Update: I think this is better without the ok return value.

@gopherbot gopherbot added this to the Proposal milestone Nov 13, 2023
@pgaskin
Copy link
Author

pgaskin commented Nov 13, 2023

example: parsing btrfs subvolume list
type BtrfsSubvolume struct {
	ID           int
	Gen          int
	CGen         int
	Parent       int
	TopLevel     int
	ParentUUID   string // of snapshots
	ReceivedUUID string // of received snapshots
	UUID         string
	Path         string
}

func btrfsSubvolumeList(path string, onlySnapshots, onlyReadonly, deleted bool) ([]BtrfsSubvolume, error) {
	var typeFilter []byte
	if onlySnapshots {
		typeFilter = append(typeFilter, 's')
	}
	if onlyReadonly {
		typeFilter = append(typeFilter, 'r')
	}
	if deleted {
		typeFilter = append(typeFilter, 'd')
	}

	var stdout, stderr bytes.Buffer
	cmd := exec.Command("btrfs", "subvolume", "list", "-opcguqR"+string(typeFilter), "--", path)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		if stderr.Len() != 0 {
			err = fmt.Errorf("%w (stderr: %q)", err, strings.TrimSpace(stderr.String()))
		}
		return nil, err
	}

	var result []BtrfsSubvolume
	sc := bufio.NewScanner(&stdout)
	for sc.Scan() {
		var v BtrfsSubvolume
		for f := strings.Fields(sc.Text()); len(f) != 0; {
			var ok bool
			var tmp string
			var err error
			switch tmp, f, _ = slicesShift(f); tmp {
			case "ID":
				tmp, f, _ = slicesShift(f)
				if v.ID, err = strconv.Atoi(tmp); err != nil {
					return nil, fmt.Errorf("parse output: id: %w", err)
				}
			case "gen":
				tmp, f, _ = slicesShift(f)
				if v.Gen, err = strconv.Atoi(tmp); err != nil {
					return nil, fmt.Errorf("parse output: gen: %w", err)
				}
			case "cgen":
				tmp, f, _ = slicesShift(f)
				if v.CGen, err = strconv.Atoi(tmp); err != nil {
					return nil, fmt.Errorf("parse output: cgen: %w", err)
				}
			case "parent":
				tmp, f, _ = slicesShift(f)
				if v.Parent, err = strconv.Atoi(tmp); err != nil {
					return nil, fmt.Errorf("parse output: parent: %w", err)
				}
			case "top":
				if f, ok = slicesCutPrefix(f, "level"); ok {
					tmp, f, _ = slicesShift(f)
					if v.TopLevel, err = strconv.Atoi(tmp); err != nil {
						return nil, fmt.Errorf("parse output: top level: %w", err)
					}
				}
			case "parent_uuid":
				if tmp, f, _ = slicesShift(f); tmp != "-" {
					v.ParentUUID = tmp
				}
			case "received_uuid":
				if tmp, f, _ = slicesShift(f); tmp != "-" {
					v.ReceivedUUID = tmp
				}
			case "uuid":
				if tmp, f, _ = slicesShift(f); tmp != "-" {
					v.UUID = tmp
				}
			case "path":
				v.Path, f, _ = slicesShift(f)
			default:
				// ignore unknown field
			}
		}
		if v.Path == "" {
			return nil, fmt.Errorf("parse output: path: missing")
		}
		result = append(result, v)
	}
	return result, nil
}

some general examples

@apparentlymart
Copy link

I like the idea of this -- I've hand-written inline variations of it a number of times.

Based on my previous similar uses of code like this, I guess I'd expect to use it in a loop something like this:

remain := thingThatProducesSlice()
for len(remain) != 0 {
    next, remain, _ := slices.Shift(remain)
    // Do something with "next"
}

In this case I don't really need the boolean result because remain being zero-length is a sufficient termination condition. However, I must admit I can't really think of a formulation without ok that I like better; the only viable alternative seems to be to panic if s is empty, and that seems worse than just having an ignorable final return value.


The name Shift confused me a little at first, until I realized it was the same meaning of "shift" as e.g. Perl's shift. Perhaps enough Go programmers are familiar with a language that uses that noun in this way that it's fine -- I had to dig it out of my long-term memory, personally 😀 -- but I think I might prefer a more direct name like TakeFirst, particularly since this function doesn't (and cannot) modify the given slice in-place to remove the first element.

(I did a little survey of languages with a shift function and found that it's out there more than I thought and I guess I've just somehow narrowly avoid learning about it in e.g. JavaScript, so maybe it's fine and I'm just an outlier.)

@mateusz834
Copy link
Member

mateusz834 commented Nov 13, 2023

@apparentlymart Your example is wrong, it should be:

remain := thingThatProducesSlice()
for len(remain) != 0 {
    var next int
    next, remain, _ = slices.Shift(remain)
}

I wonder whether it would be better to pass the slice as a pointer instead, to avoid returning it from the function.

@apparentlymart
Copy link

apparentlymart commented Nov 13, 2023

Heh thanks @mateusz834. Amusingly, now you've pointed it out I remember making exactly that same mistake in at least one of the hand-written previous implementations I was referring to, so apparently for me at least that's an attractive nuisance.

Making the function take a pointer and then overwrite it with a slice pointing one element further along does seem like it would address that problem, despite being a little unusual. I suppose it would make it a little like #64090, although voting doesn't seem favorable on that at the time I'm writing this. (Perhaps that's just because of how close to the builtin append it is rather than that it takes a pointer, but I'm not sure.)

Pointer-based version:

remain := thingThatProducesSlice()
for len(remain) != 0 {
    next, _ := slices.Shift(&remain)
    // ...
}

This now seems temptingly close in shape to a traditional "for clause", since it has all of the parts: an initializer, a predicate, and a "next element" assignment. That made me try for a version without the ok return value where all of the machinery could live on the for line, but I didn't yet find one that was pleasing to me. Perhaps others will have better ideas!

@pgaskin
Copy link
Author

pgaskin commented Nov 13, 2023

I don't think it fits with the general convention to pass it as a pointer, although I do see the appeal in doing so.

If you want a "traditional for clause", maybe something like this would work (while still allowing you to consume items within the loop):

parts := thingThatProducesSlice()
for next, remain, ok := slices.Shift(parts); ok; next, remain, ok = slices.Shift(remain) {
    // ....
}

@pgaskin
Copy link
Author

pgaskin commented Nov 13, 2023

I'd also like to point out that the main appeal for using this is in inline if statements like the following (it's just as simple to use next, remain = remain[0], remain[1:] in your for len(remain) != 0 loop example), since it eliminates having a separate, and possibly error-prone length check beforehand:

var next string
if next, rest, _ = slices.Shift(rest); next != "" {
    // ...
}

Or in a switch statement:

switch next, rest, _ := slices.Shift(rest); next {
    case something:
        // ...
    case whatever:
        // ...
    default:
        // ...
}

Or for taking the next value or a zero value if none is provided:

whatever.Field, rest, _ := slices.Shift(rest)

These three examples are cases where having a length check is error-prone and make the code more complex and less readable.

@apparentlymart
Copy link

So far it's seemed like all the examples we've been sharing with each other have just totally ignored the boolean ok result and either tested whether the slice is empty before calling it or have tested the first return value against some predicate.

Again I find myself unsure what to make of that, since it also doesn't seem right for it to just panic if someone calls it with an empty slice, but if that ok bool argument is going to be totally ignored most of the time it gives me a sense that this API design isn't quite right. 🤔

Does anyone have an example of a usage pattern that does rely on the ok bool result? (I'm intentionally disregarding the three-clause-for example above since I understood that as an argument against using three-clause-for in this case, but if I misunderstood please let me know.)

@earthboundkid
Copy link
Contributor

Given that ok is always the same as ok := len(s) > 0, maybe it can just be dropped from the return, and users who want it can just test themselves.

@pgaskin
Copy link
Author

pgaskin commented Nov 15, 2023

maybe it can just be dropped from the return

You're probably right. I added it to match how most of the Go functions like this work, but looking back at my own uses of this, there's almost no cases where I would actually use ok rather than just handling the zero value.

I guess the point of Shift is primarily to shorten the cases where you need the first element, but you don't really care if it exists -- only if the value is something you're looking for.

@adonovan
Copy link
Member

adonovan commented Dec 22, 2023

Seems reasonable. TakeFirst would be a more self-descriptive name.

[Edit: see my next note.]

@earthboundkid
Copy link
Contributor

This seems good to me:

// TakeFirst returns the first element of s and s[1:] 
// if len(s) > 0, otherwise it returns the zero value of E
// and s unchanged.
func TakeFirst[S ~[]E, E any](s S) (first E, rest S) {
	if len(s) == 0 {
		rest = s
		return
	} 
	return s[0], s[1:]
}

@adonovan
Copy link
Member

adonovan commented Dec 22, 2023

Hmm, let's see what it looks like in practice for a queue:

var elem T // sad
for len(q) > 0 {
    elem, q = TakeFirst(q) // worse than: elem, q = q[0], q[1:]
    use(elem)
}

Or:

for len(q) > 0 {
    head, tail := TakeFirst(q) // worse than: head, tail := q[0], q[1:]
    use(head)
    q = tail // sad
}

Or:

for elem, q := TakeFirst(q); elem != *new(T); elem, q = TakeFirst(q) { // makes assumptions about the zero value of elem
    use(elem)
}

In none of these cases is it as simple or efficient as the straightforward Go code.

For the len=0 logic to be of any benefit we would need a variant such as this ugly thing:

for {
    elem, ok := TakeFirst(&q)
    if !ok {
        break
    }
    use(elem)
}

I retract my enthusiasm.

@cespare
Copy link
Contributor

cespare commented Dec 23, 2023

Yeah, to my mind the only version that's even worth considering is the pointer-based version. I prefer it to just panic on empty slice:

for len(q) > 0 {
	elem := TakeFirst(&q)
	use(elem)
}

I use a helper like this in advent of code where I write little queue-processing functions and whatnot all the time. In normal/professional code, where it comes up a bit less frequently, the "plain Go" version

for len(q) > 0 {
	elem := q[0]
	q = q[1:]
	use(elem)
}

is only a tiny bit more verbose and seems fine (better than all the slices.Shift examples in this thread IMO).

So overall this seems very marginal.

@earthboundkid
Copy link
Contributor

I retract my support for TakeFirst. That seems persuasive that it's not a net savings. Oh well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

7 participants