-
Notifications
You must be signed in to change notification settings - Fork 18k
x/text/secure/precis: Comparison API #17386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I'm confused about how this would work. Can you elaborate on how this solves the "I have four nicknames" problem? Similarly, I'm confused about what a CanonicalTransformer would do. Is it because preparation and enforcement are separable steps? (Sorry, I'm not very familiar with PRECIS)?
I don't have a complete answer, but one API possibility is to have
so that you can't compare a regular and a canonical string via ==.
I don't have any experience here, but it may be a non-issue. Has performance been a concern in practice or only in theory? If it does turn out to be a problem in practice, we can always add the CanonicalXxx methods (and types?) later. |
Not preparation and enforcement (we got rid of preparation, and there's been a little talk about getting rid of it in a future version of the RFC too), but enforcement and comparison (the form you have to transform the string into to see if two strings are equivalant). Because there is a specific order of operations to the steps we can't just tack any extra comparison steps (eg. to lowercase before a case insensitive comparison) on to the already enforced text, we have to start all over from the original input and execute the transformations over again. Right now the only output from our comparison function is true or false, I'm proposing giving us a way to access the data from the comparison transformation: // Creates output useful for comparison
t := precis.Nickname.NewCompareTransformer()
a := "Compare me"
b := "Compare Me"
a = t.String(a) // We only have to do this transformation once, then we can compare with == as many times as we want against as many strings as we want without having to transform "a" again
b = t.String(b)
// a == b == "compare me"
// Create output useful for enforcement
t = precis.Nickname.NewTransformer()
a = "Compare me"
a = t.String(a)
// a == "Compare me"
That's an interesting idea; have a special type just for comparison, keying on in maps, etc. I wonder if it would end up being confusing? I'm really not sure. Does it buy us anything over effectively just duplicating the current APIs and having separate methods for enforcement and comparison? Definitely worth thinking about. |
In the collate package there is a similar problem. I'm not saying we copy the solution there, as I'm not entirely happy with it, but we could copy the terminology, where it is called Key. |
"Key" is definitely simpler and makes more sense than "canonical/non-canonical" to me. It probably makes sense to match the collate package's API at first; I don't think that would stop us from adding on a transformer like method later if we wanted, and we'll probably want the simple "just do this operation on bytes or a string" functions regardless of what other APIs eventually get added (if any). |
CL https://golang.org/cl/30952 mentions this issue. |
I'm not saying that it's a good idea, but before I forget, another API possibility is one or both of:
depending on whether users need && or || semantics. (Again, I'm not overly familiar with PRECIS). |
I'm planning to use this to create the casefolded (or key or 'canonical') version of strings because I use the casefolded version of their name as an identifier. Another point which I can't shirk is that I need to compare one user's name with hundreds/thousands of other names very often (at the moment accomplished by looking it up as part of a map). Because of this, I'd very much enjoy being able to get the canonical/key/casefolded version of a given string somehow, and this would make me very happy indeed. Thanks for your work on this cool language :) |
Yah, I think whatever the final API looks like, it needs to have a way to actually get the key. However, the Thanks for your input all. |
CL https://golang.org/cl/30981 mentions this issue. |
I'm not a fan of the CompareAll/CompareAny API.
|
That's a fair point; maybe it's worth getting rid of Compare the method entirely then and only having the APIs that return keys for comparison?
I was more thinking about comparing the same initial value against lots of things, not the same set multiple times. |
For the sake of keeping history and discussion here, I wanted to comment that I've changed the API in one of my CLs at @mpvl's suggestion: func (p *Profile) AppendCompareKey(dst, src []byte) ([]byte, error) {}
func (p *Profile) CompareKey(s string) (string, error) {} |
@SamWhited: I think it is fine to keep Compare. It is the easiest to use if you don't care about performance. |
Currently, the PRECIS comparison function is implemented as a method on a profile:
This is a pretty straight forward API, but it means we have to perform enforcement (with any additional canonical form mappings) every time we want to compare two strings. If we want to compare, say, a nickname against three other nicknames, we have to do 6 transformations (the first nickname three times and the three we're comparing it against once each). It would be useful to have an API to do comparison that required us to transform each string we want to compare only once (4 transformations in the prevoius scenario).
I see a few options:
enforce
internally or perform some other caching (eg. cache all steps up to where enforcement and comparison diverge, then when the user calls the comparison method only execute the last few steps, possibly caching that output too)Internally caching enforcement operations would potentially make the package less useful on resource constrained environments (not sure if we care or not), and is potentially a lot of work. Since this package isn't for a specific use case, it may be hard to create an effective caching strategy.
Adding some functions that mirror the enforcement API make sense:
but this may be confusing (since the user will have to manually remember what transformation to apply in each situation, or whether or not they'd already mapped something to its canonical form before they do comparison, etc.)
Allowing them to keep a single, separate "comparison" profile may help this, but I'm not sure if it's all that much better to have to think about two separate profiles and when to use them:
Other suggestions appreciated; or maybe this is a non-issue that doesn't really matter given how fast the transformations in the x/text packages are?
/cc @mpvl @nigeltao
The text was updated successfully, but these errors were encountered: