New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: crypto/rand: add String #67057
Comments
Is there a reason to set the entropy to a hidden value rather than provide it as a parameter (other than Fillipo knows best)? What if I need more randomness tomorrow? The idea is cool but it seems too rigid to me, but I am not a security maven. |
I started with taking an int parameter for the entropy, then started going back and forth on whether it should be the length of the returned string (what if the caller does the math wrong, or does it right but later changes the alphabet to be smaller thinking the parameter is the entropy and doesn't need to change?) or the bits of entropy (what if the caller thinks it's the length of the string and passes a way too small number?). This is one of those cases where we can do the work for the user and do the secure thing directly. So yeah, we can know what the best answer is, so we can save callers the work and risk. If a caller knows they are fine with less entropy (e.g. for a PAKE or 2FA token), they can slice the string. The performance overhead is unlikely to be a bottleneck, and the slicing can alert a security reviewer to pay attention. There's no need for more entropy than 128 bits against brute force attacks. If making hundreds of thousands of billions of strings users might need more entropy against collisions, but they will probably have someone on the team that knows that by the time they design such a system, and they can just call |
I find the String() function with a parameter strange because the String() method doesn't usually have parameters. The approach I usually apply is to use crypto/rand.Reader to get 16 random bytes and converting it with base64.RawURLEncoding.EncodeToString(). The resulting string reflects all the bits in the random value. Using this approach a RandomStringer with a String() method is quite easy to implement. crypto/rand could have a String() function that uses an internal default RandomStringer value. IMHO this approach would be more efficient than a String function that has to validate the alphabet parameter at every call. |
Would String("aab") exhibit a bias? |
@ulikunitz Efficiency is not a primary concern here, I don't think applications generate passwords or tokens in a hot loop, at least not hot enough that checking that a string is valid UTF-8 will matter.
Yes, |
Is This API can be misused to generate always the same string or biased strings. The following API doesn't have these weaknesses:
The alphabet will produce slightly larger strings, but according to Crockford is a good compromise between compactness and error resistance. I named the method RandomString() to be explicit about what the method does. The String() function is ok, because it will be used as rand.String() in almost all cases. I assume that the String function will not support Unicode combining characters. Some people might want to use the function to create keys for databases. They might be concerned about performance. |
I like the proposal, but I'm also concerned about the alphabet just being a plain string. Inevitably somebody will read the alphabet in from a config file or basically anything other than |
Please refer to the proposed function's documentation's last paragraph (where "two" is sensibly read as "two different"): // (...)
// The alphabet is interpreted as a sequence of runes, and must contain at least
// two Unicode characters, or String will panic.
func String(alphabet string) string I urge you not to dilute the expressive power of the standard library by inventing novel concepts (such as |
I apologize for overseeing the panic condition. I suggest however to check for the runes in the string to be unique or as my math teacher used to say pairwise different. I did an experimental implementation and testing for uniqueness is not dramatically slower than testing for at least two different runes for short alphabets. And with 2 ms per call an Alphabet type is not required.
String1 tests for at least two different runes. String2 checks for a uniqueness. The implementation can be found here: https://github.com/ulikunitz/randstr/blob/main/string.go (Updated the comment, had a problem with the computation of the number of runes required.) |
After review I observed that I needed more than 128 random bits to ensure that the last character is selected from the full alphabet. The benchmark results are now:
|
Previous discussion #53447 |
I go back and forth on the name String. I wonder if 'Text' is better.
somehow seems clearer than
I agree that 128 bits of entropy is the right amount and unlikely to need to change. Is the idea that the implementation will always just copy the alphabet into a []rune (or as an optimization maybe a []byte) and then index the right number of times, and then discard that copy? |
I like rand.Text but I wonder if I would think it's more like Lorem Ipsum reading the docs the first time.
Correct. As for alphabet validation, if we wanted to be strict, we could disallow repeated characters, as well as Unicode joiner characters, to avoid someone putting in a multi-rune character and being surprised when they are selected separately. I am a little worried about applications letting attacker-controlled alphabets in and causing panics. A solution would be taking a page out of safeweb, and defining a private string type, so that only string constants and literals are allowed.
Not sure why applications would take alphabets as a parameter, and we could just add a line to the docs recommending against it. |
This proposal has been added to the active column of the proposals project |
Same as in #66821 (comment), |
Will the function ensure that every character of the returned string will be selected from the whole alphabet? If yes than in some cases more than 128 bits of entropy will be required. For example an alphabet of 32 characters will require 26*5 = 130 bits. It should be a requirement to ensure that all same-length substrings of the generated string represent the same amount of entropy. |
I am probably bikeshedding, but I would welcome a name akin to |
If there would be a digit alphabet-based number formatter (say strconv.Format(rand.Int64(), alphabet) + strconv.Format(rand.Int64(), alphabet) |
Random strings are useful as passwords, bearer tokens, and 2FA codes.
Generating them without bias from crypto/rand is not trivial at all, and applications are lured into using math/rand for it. See greenpau/caddy-security#265 for example.
I propose we add a top-level function that takes a charset and returns a string of elements randomly selected from it.
The length of the string is selected automatically to provide 128 bits of security. If uppercase and lowercase letters and digits are used, a string will be
ceil(log62(2^128))
= 22 characters long.If more than 2^48 strings are generated, the chance of collision becomes higher than 2^32, but that's also true of UUID v4. Callers that reach those scales could call String twice and concatenate the results, but it doesn't feel worth documenting.
There is no error return value on the assumption that #66821 is accepted. That allows convenient slicing if necessary.
This can't really be implemented in constant time, but since it always runs randomly, attackers can get only a single timing sample, which limits the maximum theoretical leak to a few bits.
Do we already have constants for the most common charsets defined somewhere?
/cc @golang/security @golang/proposal-review
The text was updated successfully, but these errors were encountered: