New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x/text: add precis implementation #13336
Comments
/cc @mpvl for thoughts |
Thanks for the proposal. I'm fully in favor of adding this functionality to the text repo. That said, I have a few remarks and suggestions on the current design:
|
FWIW, my first hunch at an API that is in line with the rest of the text repo would be the below. A bit elaborate, but it allows for extensive optimization as most used transformers are stateless.
This is a quick sketch, so by no means saying this is how it should be. package secure type Profile struct{} func NewIdentifier(opts …Option) (_Profile, error) var Identifier *Profile // principle methods // wrapper methods for Transformer equivalents, allowing optimization and simpler API when transformer is stateless (most cases). type Option func(*options) func Norm(f norm.Form) Option type Transformer |
@mpvl Thanks for the feedback.
Fair enough; I'm not sure that anyone who doesn't know what this is would use it / apply it properly anyways though, and if they know what it is they'll (probably?) look for it under this name.
That's not bad, although that might give people a false impression about what the package should be used for; some classes (FreeForm) are much more about expressiveness than security. I'd hate people to get the wrong impression from the package name (or maybe I'm overthinking it). Either name is fine by me, however; I don't have any alternative suggestions. At least while I'm developing it separately I'll keep it as
Absolutely; that's actually on my radar. The current work has all been into generating the tables, and the
Will do; I wouldn't have wanted to use an internal package unless I was working on this with the aim of contributing it to the text package, but now that that's the goal I'll look into it.
Agreed. Thanks for all your feedback. I'll start working on a proper API for the package this weekend. |
Out of curiosity, where is the appropriate place to discuss the package / design? Do you all tend to use the proposal, or the mailling list? |
Update: I've switched the implementation around to using the Also, that API makes good sense to me and I've started implementing it. I'm not sure if it makes more sense to use the methods that @mpvl mentioned, which are consistent with a lot of names in other func (p *Profile) Enforce(s string) (string, error) {}
func (p *Profile) String(s string) (string, error) {}
func (p *Profile) PrepareString(s string) bool {}
func (p *Profile) ValidString(s string) bool {} // This was error above, but chances are we just want to check it. Maybe a separate API to get a reason? I'll think about it. |
Neat. Note that you can also use x/text/internal/ucd to further simplify the code. It parses UCD files for you. I think It makes more sense for you to submit directly in golang.org/x/text directly so we can make use of the Gerrit code review etc. Just specify in the package comments that the package is under full development and that the API may change. As for the package name: the text repo has mostly adopted the convention of naming packages after functionality instead of standards. Noticeable exceptions are the packages for handling data formats (cldr, ucd). I agree secure.FreeForm is a bit confusing/misleading. Let's stick with precis for now and decide before the "under construction label" is removed. (Also precis is actually a word as well, related to text nonetheless, so this may be even more confusing). As for naming, I find Enforce and PrepareString somewhat misleading names myself. Also, it is more likely that the reader of code understands Go convention than PRECIS convention. It should be clear in the documentation that the Go names are related to respective PRECIS names. Also, I'm not entirely agreeing with the standard. Valid/Prepare should not just return false if a string contains characters disallowed by the underlying Class, but also, in this case, for characters disallowed by a profile or even a transformation associated with a profile (normalization, case/width folding, etc. always succeed, but enforcing single script might not). So not following PRECIS convention with naming has the additional benefit that a PRECIS expert will read the documentation to see what the method does exactly. So using Go names is a double win. :) As for Valid returning a bool: I think the error would be useful, but it is indeed more consistent to get a bool. Note that the user could always get the error from the Transform, although this would not be as efficient. |
That convinces me; thanks.
Wilco. |
I subtree merged my changes into the For now, I've moved my repo to https://bitbucket.org/SamWhited/golang-x-text (which is now a full mirror of the text package) and will try to remember to keep it updated in case anyone wants to see the progress. |
Using Gerrit will allow me to give feedback on the code early. It will help catch issues early and avoids rewrites. But either way. |
@mpvl Are there instructions for how patches should be submitted online anywhere that don't require you to download some extra tool? Should I just mail a patch to golang-codereviews manually? |
@SamWhited Sorry, we can only accept patches through the Gerrit interface. Please do not send a manual patch. Docs for contributing via Gerrit are at https://golang.org/doc/contribute.html . |
@ianlancetaylor So there's no way to do that w/o installing git-codereview? I couldn't find anything in the Gerrit interface about it. |
You don't "download" some extra tool. You "go get" it. That should be a low bar for anybody interested in contributing to Go. |
Ah, okay, sorry, not interested in contributing then. I'll keep hacking on my package as part of the x/text package and juse use a local version, and you're welcome to take it or leave it if you want. Thanks anyways. |
I think we ultimately need to include this. It will have to go through Gerrit. Even if this may not happen now, we will need this functionality in the future. Reopening so we can keep track of this. |
Sorry, I should have followed up: Someone else informed me that it IS possible to submit to Gerrit manually without installing the extra dev tools. I will finish the package / do this at some point. Right now I'm trying to stop being lazy and rewrite the API (the options API didn't work out, because I was forgetting that the steps might be applied in different orders / at different stages, preparation, enforcement, etc. for different profiles). Perhapse this weekend I'll rework it a bit and finish it up. |
Went ahead and pushed the initial changes as requested, even though the API probably needs a rethink: https://go-review.googlesource.com/17933 |
Thanks for the review! I've fixed a number of things from the review (still haven't redesigned the API, so most of the things that weren't fixed were things that would be removed anyways) and want to update; what's the process for updating on Gerrit? The relavant section in the docs doesn't appear to have normal Git workflow instructions. If I push up a new detatched patch-set will it create a new review, or is it smart enough to update my old one? This process is insane; I'd forgotten why I stopped using Gerrit years ago, they seem to have no clue how Git is supposed to work. EDIT: Nevermind, figured out that that's what the change ID is for and updated the old commit. This is still insane, why in the world would Gerrit do this instead of just using a git ref and letting you update that (a branch or a tag)? I've now added an explicit class struct that implements the runes.Set interface. This gives us a Contains(rune) method for the classes. Will work on the profile API next. I have submitted an errata to the RFC for the language that got me confused and made me think the current API would work in the first place. |
On Sat, Dec 19, 2015 at 11:02 PM, Sam Whited notifications@github.com
|
Someone told me that about using CVS recently too. If you're going to use Git, why not actually use Git? Anyways, didn't mean to start a flame war, I know I can't change it (and, sadly, don't have a better recommendation anyways), but it's still ridiculous. |
I don't seem to be able to put followup comments on Gerrit, so I'll bring this back here again: RE the API for defining what steps need to be performed in prepare: It looks like we won't be able to just always perform width mapping as part of prepare (this is not standards compliant, and I have identified cases where it will break compatibility with PRECIS implementations in other languages, which rather defeats the purpose of using a framework like PRECIS if your clients/servers/etc. can't interop). So, I see a few options for the API that might be okay: Option 1: Define a pre-prepare option as a functionPrePrepare(f func(s string) (string, error)) Option Option 2: Define a pre-prepare option as a list of transformers(this is what I was doing before) PrePrepare(t ...transform.Transformer) Option |
For now, I've also implemented the Enforce method as far as it can go, and fixed the Allowed method to show why it's different from the class level Contains method. I'm thinking that maybe allowed and disallowed should be removed in favor of just having profile's be Set's too. Next thing on my radar is more tests; a few other implementors and I are currently discussing writing a standard set of test vectors and submitting them as an RFC as well; if anyone is interested, see the PRECIS WG mailing list. This is probably ready for another round of review. EDIT: Yah, not sure why I did EDIT 2: I'm also starting to wonder why we have an |
Allowed/Disallowed: IIUC, the set of allowed runes is not inverse of the set of disallowed runes. That's why probably. |
It sounds like the proposal - the idea of adding precis support - has been accepted. Now this issue seems to be serving as code review. @SamWhited, why not reply on Gerrit? |
Re your comment 4 days ago, @SamWhited as long as you put a Commit-Id line in each commit, then you can push as many commits as you want to Gerrit. They all turn into separate reviews. Honestly the simplest way to do this is 'go get golang.org/x/review/git-codereview' and then run 'git-codereview hooks' in your git repo. That will install a commit-msg hook that adds the Commit-Id line for you. You don't need to use any of the rest of the workflow if you want - raw git commands are just fine - but I imagine you'd get tired of making up random Commit-Id lines at some point. |
@rsc I'll double check, I tried to push several commits (with commit ID's) and I think it just told me that I had to squash first. If not, that would be far better! (although I still don't understand the need for commit ID's when Git already has commit ID's, and patch sets [aka branches/tags/any other ref], etc.) Using the pre-commit hook sounds good; thanks. I'll start making separate reviews after the major body of the work is reviewed / merged. RE: replying on Gerrit, I think I did reply to everything, no? I just wanted to leave updates here to keep this issue up to date; if these issues are just meant for proposals and not to track the work though, I can stop. I'm still figuring out the workflow. EDIT: Ah, just saw your thread on go-dev talking about this and saying you'd update the contribution guidelines. Many thanks! |
@rsc just tried the pre-commit hooks you mentioned. With them, it won't even let me commit on my own branch.
|
@SamWhited, depending on your setup that check may not be appropriate. (It is for the typical workflow we use.) If you zero the file .git/hooks/pre-commit then that check won't run anymore. |
Ah, many thanks; I guess I should have gone and actually looked at what the tool installed; didn't realize they were separate hooks (though in retrospect I suppose they'd have to be). |
@mpvl ping; any chance for another review soon (I know you're probably busy what with the upcomming 1.6 release, just wanted to keep this issue on your radar)? |
Fixed the issues from the last review, but was having some trouble with Gerrit's single-commit / commit-id bullshit; if anything looks off please let me know and I'll close that CL and open a new one. |
Sorry for the delay. It is a bit tricky with that many patch sets, but doable. So no need to do Note that it is perfectly fine to locally accumulate commits (e.g. using I have a few minor comments left. On Fri, Jan 15, 2016 at 5:49 PM, Sam Whited notifications@github.com
|
Many thanks, those are now fixed. Once this is merged I'll add a new CL for the profiles that we took out and try to keep future CL's a bit smaller. |
Merged: https://godoc.org/golang.org/x/text/unicode/precis EDIT: I don't appear to be able to update the milestone, if someone could do that I'd appreciate it. /cc @mpvl @rsc |
As for someone unfamiliar with the release schedule and such, when it will be available? Right now the link https://godoc.org/golang.org/x/text/unicode/precis says not found. |
The package is available, but has been moved to x/text/secure/precis. On Sat, Feb 13, 2016 at 2:39 PM, Way, No notifications@github.com wrote:
|
As mpvl said, this has been moved to x/text/secure/precis. Keep in mind that the API may still be changed or moved in other ways (eg. profiles moved out into the parent package). @mpvl: Could you update the milestone on this issue? Thanks. |
I don't know if any of the go core milestones make sense here. I've cleared the field to avoid confusion. |
Cool, thanks! Looking forward for BIDI rules. |
Update: context rules and bidi rule are now implemented. Performance has also been greatly enhanced. The package should be conform the spec now. It it is not, please let me know. |
I am currently writing a draft for the working group of PRECIS test vectors; I'll check the package against what I have so far and report back if anything is broken. |
Adds an implementation of the PRECIS framework (RFC 7564) and related profiles (RFC 7613, RFC 7700). Fixes golang#13336 Change-Id: I5db5601911c45c43fff9f44ca8d8142f795ad49d Reviewed-on: https://go-review.googlesource.com/17933 Reviewed-by: Marcel van Lohuizen <mpvl@golang.org>
I would like to see the
golang.org/x/text
packages contain a PRECIS implementation.PRECIS is a replacement for stringprep which allows for the classification of characters based on Unicode properties. For more information on PRECIS see RFC 7564 (txt)
I am filing this issue in accordance with the contributing document, and have started implementing a package which might be used for a base here: https://bitbucket.org/mellium/golang-x-text. I will be happy to keep iterating on this if it does look like something that would warrant inclusion in
x/text
. Feedback on the approach would be appreciated (though perhaps the mailing list is a better place for that? The guidelines are a bit vague when it comes to thex
packages and how their development is handled).The text was updated successfully, but these errors were encountered: