Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: strings: function to check if a string contains all specified runes #63328

Closed
knbr13 opened this issue Oct 2, 2023 · 8 comments
Closed
Labels
Milestone

Comments

@knbr13
Copy link

knbr13 commented Oct 2, 2023

Use Case

Sometimes I encounter scenarios where I need to check if a given string contains all of a specific set of runes, such as vowels ('a', 'e', 'i', 'o', 'u'). Currently, I have to iterate through the string and runes manually.

Proposal
I propose adding a function to the strings package that can efficiently check if a string contains all of the specified runes. For example:

func containsAllRunes(s string, runes []rune) bool {
	for _, r := range runes {
		if ContainsRune(s, r) {
			return false
		}
	}
	return true
}

- Note that I didn't prefixed ContainsRune function with 'strings' which is the name of the package, because this function is intended to be in the strings package

- The implementation provided above is a basic example for illustration purposes.The actual implementation can be optimized for better performance and efficiency.

This function would greatly simplify such checks and improve code readability.
usage example:

func main() {
	s := "hello world from Go!"
	b := containsAllRunes(s, []rune("abcde"))
	fmt.Println("the string contains all the provided runes:", b)
}
@knbr13 knbr13 added the Proposal label Oct 2, 2023
@seankhliao
Copy link
Member

seankhliao commented Oct 2, 2023

how common is this that it should be in the standard library?
can you point to some existing code that would be simpler if this function were available?

@seankhliao seankhliao changed the title strings package: Add function to check if a string contains all specified runes proposal: strings: function to check if a string contains all specified runes Oct 2, 2023
@gopherbot gopherbot added this to the Proposal milestone Oct 2, 2023
@seankhliao seankhliao added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Oct 2, 2023
@odeke-em
Copy link
Member

odeke-em commented Oct 3, 2023

Thank you for filing this issue @abdullah-alaadine and welcome to the Go project. Thanks @seankhliao for the reply!

@abdullah-alaadine if I may, I believe that for correctness you wanted to use an inversion of the conditions so if !strings.ContainsRune(s, r) and not if strings.ContainsRune(s, r) so as to be

package main

import (
	"strings"
)

func containsAll(s string, wants []rune) bool {
	for _, r := range wants {
		if !strings.ContainsRune(s, r) {
			return false
		}
	}
	return true
}

func main() {
	println(containsAll("hello world from Go!", []rune("eld")))
}

but purely academically, that code above is expensive in the worst case which could become a quadratic time complexity lookup O(l*k) where l = len(s), k = len(wants) though in practice it is very fast and doesn't need a change in the Go standard library as that's most likely what everyone will default to for small strings

Improved implementation

For large strings or larger slices of want, you could use strings.ContainsFunc which got introduced in Go1.20 to iterate exactly once over wants and store the values in a map that'll be used later on to check if a value exists in s so giving a linear time lookup of O(n) where n = max(l, k) per https://go.dev/play/p/ZyN8JEFhLzJ or

package main

import (
	"strings"
)

func containsAll(s string, wants []rune) bool {
	m := make(map[rune]bool, len(wants))
	// iterates over all the runes in wants exactly once: O(l)
	for _, r := range wants {
		m[r] = false
	}

	// Iterates over all the string's runes exactly once: O(k)
	strings.ContainsFunc(s, func(r rune) bool {
		if _, inWants := m[r]; inWants {
			// Constants time lookups & insertions: O(1)
			m[r] = true
		}
		return false // Return false to iterate over all runes in s.
	})

	for _, wasSeen := range m {
		if !wasSeen {
			return false
		}
	}
	return true
}

func main() {
	println(containsAll("hello world from Go!", []rune("elfd")))
	println(containsAll("hello world from Go!", []rune("elfdz")))
}

Academically my above suggestion seems cheaper but in practice for small slices, iterating over 2 slices is very fast and cheaper

Proposal motivations and demand for this code

Now given that a more efficient implementation had to be scrutinized by myself and it requires more thought for larger values, perhaps that's good motivation for the standard library to take the logic away from everyone having to implement it, however we need data how much code out there has to look for all runes, how often is this pattern required, can you please compile usages of the need for this code? @rsc got a Go corpus compiled and also Sourcegraph allows you to search over public Go repositories https://sourcegraph.com/search /cc @sqs @beyang

@knbr13
Copy link
Author

knbr13 commented Oct 3, 2023

mm, yes I think that the real-world use cases for this function are relatively rare, for me, I needed this helper function in this LeetCode problem "316. Remove Duplicate Letters", where I want to check if a substring of the whole string contains a set of specific characters, but I can solve the problem in a simpler and more efficient way, and I don't have specific public examples to point to, but I've thought of some use cases that this helper function is useful, like if I am building a game, where users need to press specific keys a certain number of times, and the order of the input doesn't matter (which it's not usually case in games).
I think that the real-world use cases for such helper are actually rare, but I'll think more about it to see if there's a real need for it.

Anyway, I'll add the updated implementation of this function to make it work as expected. (The provided one above is just to explain the purpose of it, it won't work as expected in some cases).

function version 2:

func containsAll(s string, wants []rune) bool {
	m := map[rune]int{}
	for _, r := range s {
		if _, ok := m[r]; !ok {
			m[r] = 1
		} else {
			m[r]++
		}
	}
	for _, r := range wants {
		if _, ok := m[r]; !ok {
			return false
		} else if m[r] == 0 {
			return false
		}
		m[r]--
	}
	return true
}

The problem with this implementation is that if the developer need to check if x number of a specific rune is in the string, like
str: 'hello world from Go!' runes: the string must contain 5 'a', 3 'b' and 1 'c', then the developer have to write them manually:
res := containsAll("hello world from Go!", []runes("aaaaabbbc")
this will become a problem as the number of occurrences needed increases,
this problem will be addressed in function version 3.

function version 3:

func containsAll(s string, wants map[rune]int) bool {
	m := map[rune]int{}
	for _, r := range s {
		if _, ok := m[r]; !ok {
			m[r] = 1
		} else {
			m[r]++
		}
	}
	for k, v := range wants {
		if _, ok := m[k]; !ok && v > 0 {
			return false
		} else if m[k] < v {
			return false
		}
		m[k] -= v
	}
	return true
}

example on using it:

func main() {
	str := "hello world mr Go!"
	res := containsAll(str, map[rune]int{
		'r': 2,
		'o': 3,
		'h': 1,
		'!': 1,
	})
	fmt.Println("contains all runes:", res)
}

@knbr13
Copy link
Author

knbr13 commented Oct 3, 2023

Thank you, @odeke-em, for your notes. Yes, you are right; I missed the '!' unintentionally. I'll search to see if there are a decent number of use cases for this helper function to determine whether it's worth adding to the standard library or not.

@odeke-em
Copy link
Member

odeke-em commented Oct 4, 2023

Gotcha and thank you for giving me the context for your need @abdullah-alaadine! I just looked at your problem and I believe you can definitely solve it in much simpler, more efficient and faster ways without having to use a map, you can just use a slice of 26 booleans values as an index, given the constraints:

  • Input is lowercase English letters: a to z so 26 letters
  • Deduplicated values
  • Lexicographic order of outputs

so https://go.dev/play/p/ecMWraalU5J

package main

import "fmt"

func dedupLettersAndLexicographicOrder(s string) string {
	// Constraint, lower case English letters in s.
	// Output: Deduplicated letters in lexicographic order.
	index := make([]bool, 26)
	for _, r := range s {
		if i := r - 'a'; index[i] == false {
			index[i] = true
		}
	}

	// Finally return the result:
	res := make([]rune, 0, len(index))
	for i, ok := range index {
		if ok {
			res = append(res, rune(i+'a'))
		}
	}

	return string(res)
}

func main() {
	rubric := map[string]string{
		"aaaaaaabdeeeecfff": "abcdef",
		"bbbefffghaaaaaaa":  "abefgh",
	}

	for input, want := range rubric {
		got := dedupLettersAndLexicographicOrder(input)
		if got != want {
			fmt.Printf("%q =>\n\tGot:  %q\n\tWant: %q", input, got, want)
		}
	}
}

I think that once you get good data on this then the proposal can proceed.

@knbr13
Copy link
Author

knbr13 commented Oct 4, 2023

Thanks for sharing your solution. @odeke-em
It's indeed simpler and more efficient.
But actually your solution will not pass the test cases on LeetCode, because there's one more constraint you missed (the problem description is bad), which is that you are not allowed to rearrange the characters in the string, I'll copy and paste the examples given in the problem descrition:

Example 1:

Input: s = "bcabc"
Output: "abc"

Example 2:

Input: s = "cbacdcbc"
Output: "acdb"

Example 1 is clear, but in example 2, as you can see, the output is acdb not abcd.
In ex1 there are 3 possible strings: bca, cab, and abc.
but in ex2 there is no abcd string without rearrangement.

@seankhliao seankhliao added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Oct 7, 2023
@knbr13
Copy link
Author

knbr13 commented Oct 22, 2023

after spending sometime on searching for real-world use cases for such function, where the developer has to use this function, I didn't find any real-world scenario for it, so I think it's better not adding it to the standard library. I'd like to hear back from you about your take on it, and thanks!

@seankhliao seankhliao removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Dec 8, 2023
@adonovan
Copy link
Member

after spending sometime on searching for real-world use cases for such function, where the developer has to use this function, I didn't find any real-world scenario for it, so I think it's better not adding it to the standard library. I'd like to hear back from you about your take on it, and thanks!

I think you're probably right; I don't think I've ever needed this function except when solving word puzzles.

@odeke-em odeke-em closed this as not planned Won't fix, can't repro, duplicate, stale Jan 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

5 participants