x/text/number: provide way to query number system #53872

Open
golightlyb opened this issue Jul 14, 2022 · 14 comments

@golightlyb
Contributor

golightlyb commented Jul 14, 2022

(edited to reduce scope)

/x/text/language lets you query Region and Script:

func (t Tag) Region() (Region, Confidence)
func (t Tag) Script() (Script, Confidence)

There should also be a way to query the number system that Go has matched for the locale. Currently, this is possible with the Extensions method on a language.Tag, but only if the number system was specified explicitly in the locale string. Otherwise, there is no easy way to find out which default number system was chosen.
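To illustrate the limitation, here is a minimal, standard-library-only sketch of reading an explicit "nu" key out of a tag's "u" extension. The helper name explicitNumberingSystem is hypothetical and the parsing is deliberately simplified; it is not how x/text parses tags. The point is that it can only report a number system that was written into the tag, and has nothing to say about a default.

```go
package main

import (
	"fmt"
	"strings"
)

// explicitNumberingSystem reports the value of the "nu" key in the tag's
// BCP 47 "u" extension, if one is present. Hypothetical helper for
// illustration: it does not validate the tag the way a real parser would.
func explicitNumberingSystem(tag string) (string, bool) {
	parts := strings.Split(strings.ToLower(tag), "-")
	inU := false
	for i, p := range parts {
		if i > 0 && len(p) == 1 {
			if p == "x" {
				break // everything after "x" is private use
			}
			// A singleton starts a new extension; only "u" matters here.
			inU = p == "u"
			continue
		}
		// Keys are two characters long; their type values are three or more.
		if inU && p == "nu" && i+1 < len(parts) && len(parts[i+1]) >= 3 {
			return parts[i+1], true
		}
	}
	return "", false
}

func main() {
	for _, tag := range []string{"en-GB", "ar-u-nu-latn", "ta-u-nu-tamldec"} {
		nu, ok := explicitNumberingSystem(tag)
		fmt.Printf("%s: nu=%q explicit=%v\n", tag, nu, ok)
	}
}
```

For "en-GB" and "ar" the helper finds nothing, which is exactly the case where a SystemFromTag function would be needed.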

This is probably best implemented by golang.org/x/text/number exporting a function called "SystemFromTag". This should return a number.System, which should support the stringer interface.

This is already done in /x/text/internal/number (InfoFromTag), and would be a trivial wrapper.

In future, a number.System could export other useful information that the internal number.InfoFromTag exposes, but this is not necessary for now.

Use case

This would let client code identify the number system that /x/text selected for a locale and, if it needs more information, look it up in the Unicode CLDR data, which keeps it in a single file at supplemental/numberingsystems.xml.

Without this, client code faces the much more involved process of re-implementing locale string parsing and the mapping from a locale to its default number system, including the hierarchy of parent locales.
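As a rough idea of what that re-implementation involves, the sketch below resolves a default by walking from the most specific locale towards the root. Everything here is an assumption made for illustration: the function name, the one-entry table (taken from the "ar" example in this issue), and the fallback by truncating subtags. Real CLDR inheritance also has explicit parent-locale overrides and needs the full data set.

```go
package main

import (
	"fmt"
	"strings"
)

// defaults is a tiny hand-written table for illustration only; a real
// implementation would need the complete CLDR data.
var defaults = map[string]string{
	"ar": "arab",
}

// rootDefault is the number system assumed when no locale in the chain
// specifies one.
const rootDefault = "latn"

// defaultNumberingSystem looks up the locale, then each shorter prefix
// obtained by dropping the last subtag, and finally falls back to the root.
func defaultNumberingSystem(locale string) string {
	for l := locale; l != ""; {
		if ns, ok := defaults[l]; ok {
			return ns
		}
		i := strings.LastIndex(l, "-")
		if i < 0 {
			break
		}
		l = l[:i]
	}
	return rootDefault
}

func main() {
	for _, l := range []string{"en-GB", "ar", "ar-EG"} {
		fmt.Printf("%s -> %s\n", l, defaultNumberingSystem(l))
	}
}
```

Even this toy version needs a locale parser in front of it before it can accept arbitrary tags, which is the duplication the proposal aims to avoid.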

Example Usage

package main

import (
    "fmt"

    "golang.org/x/text/language"
    "golang.org/x/text/message"
    "golang.org/x/text/number"
)

func main() {
    ts := []language.Tag{
        language.MustParse("en-GB"),
        language.MustParse("en-GB-u-nu-fullwide"),
        language.MustParse("ar"),
        language.MustParse("ar-u-nu-latn"),
        language.MustParse("ta"),
        language.MustParse("ta-u-nu-taml"),
        language.MustParse("ta-u-nu-tamldec"),
    }

    for _, t := range ts {
        fmt.Printf("%s\n", t.String())

        r, _ := t.Region()
        fmt.Printf("%s, %s\n", r.String(), r.ISO3())

        s, _ := t.Script()
        fmt.Printf("%s\n", s.String())

        message.NewPrinter(t).Println(number.Decimal(123456789))

        // PROPOSED:
        // n, _ := number.SystemFromTag(t)
        // fmt.Printf("%s\n", n.String())

        fmt.Println("---")
    }


    // Expected Outputs:
    // en-GB
    // GB, GBR
    // Latn
    // 123,456,789
    // latn
    // ---
    // en-GB-u-nu-fullwide
    // GB, GBR
    // Latn
    // 123,456,789
    // fullwide
    // ---
    // ar
    // EG, EGY
    // Arab
    // ١٢٣٬٤٥٦٬٧٨٩
    // arab
    // ---
    // ar-u-nu-latn
    // EG, EGY
    // Arab
    // 123,456,789
    // latn
    // ---
    // ta
    // IN, IND
    // Taml
    // 12,34,56,789
    // latn
    // ---
    // ta-u-nu-taml
    // IN, IND
    // Taml
    // 12,34,56,789 // taml is not a decimal format, so ignore this line
    // taml
    // ---
    // ta-u-nu-tamldec
    // IN, IND
    // Taml
    // ௧௨,௩௪,௫௬,௭௮௯
    // tamldec
    // ---
}

Example implementation

/x/text/number should change as follows:

// System holds information about a numbering system
type System struct {
    info number.Info // from /x/text/internal/number
}

// SystemFromTag returns a Numbering System for the given language tag.  If it
// was not explicitly given (e.g. "en-u-nu-mathbold"), it will infer a most
// likely candidate. This is subject to change.
func SystemFromTag(t language.Tag) (System, language.Confidence) {
    // TODO select a Confidence
    return System{info: number.InfoFromTag(t)}, confidence
}

// String returns the BCP 47 U Extension representation for the Number System Identifier.
func (s System) String() string {
    ....
}

Open questions

  1. The documentation for Region.String says it returns "ZZ" for an unspecified region. Script.String returns "Zzzz" for an unspecified script. Would SystemFromTag ever fail to return a numbering system? Could it in the future? If so, what should that system's string representation be? Probably just the default, with an appropriate "No" confidence value?

  2. /x/text/number currently doesn't support number system categories at all - e.g. "ta-u-nu-native", "ta-u-nu-traditio" or "zh-u-nu-finance" - only explicit matches e.g. "ta-u-nu-tamldec". Should this be implemented first? It would probably impact the returned Confidence value. (See x/text/number: understands specific BCP-47 u-nu-extensions, but not general categories #54090)
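One possible way the two questions could interact is sketched below, purely as a thought experiment. The Confidence type copies the names of language.Confidence so that the example has no dependencies. The resolve function, both tables, and the choice of which confidence goes with which case are all hypothetical and are not taken from x/text or generated from CLDR.

```go
package main

import "fmt"

// Confidence mirrors the value names of language.Confidence in x/text; it is
// redefined locally only to keep this sketch self-contained.
type Confidence int

const (
	No Confidence = iota
	Low
	High
	Exact
)

// known lists the concrete number system identifiers this sketch accepts.
var known = map[string]bool{
	"latn": true, "arab": true, "fullwide": true, "taml": true, "tamldec": true,
}

// categories maps a language and a category keyword to a concrete system.
// Hand-written, illustrative entries only.
var categories = map[string]map[string]string{
	"ta": {"native": "tamldec", "traditio": "taml"},
}

// resolve picks a number system for a language given the requested "nu"
// value (possibly empty) and the language's default. The confidence levels
// assigned to each case are an arbitrary choice made for this example.
func resolve(lang, nu, def string) (string, Confidence) {
	switch {
	case nu == "":
		return def, Low // nothing requested: inferred default
	case known[nu]:
		return nu, Exact // explicit, recognized identifier
	}
	if ns, ok := categories[lang][nu]; ok {
		return ns, High // category keyword resolved through the table
	}
	return def, No // unrecognized request: fall back to the default
}

func main() {
	for _, nu := range []string{"", "tamldec", "native", "bogus"} {
		ns, c := resolve("ta", nu, "latn")
		fmt.Printf("nu=%q -> %s (confidence %d)\n", nu, ns, c)
	}
}
```

Under this scheme SystemFromTag always returns some system, and the "No" confidence value plays the role that "ZZ" and "Zzzz" play for Region and Script.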

References

  1. https://cldr.unicode.org/translation/core-data/numbering-systems
  2. https://www.unicode.org/reports/tr35/tr35-numbers.html#Numbering_Systems
@gopherbot gopherbot added this to the Proposal milestone Jul 14, 2022
@seankhliao seankhliao changed the title proposal: /x/text/number: promote IsDecimal, Digit, Symbol from internal proposal: x/text/number: promote IsDecimal, Digit, Symbol from internal Jul 14, 2022
@golightlyb golightlyb changed the title proposal: x/text/number: promote IsDecimal, Digit, Symbol from internal proposal: x/text/number: promote IsDecimal from internal Jul 14, 2022
@golightlyb golightlyb changed the title proposal: x/text/number: promote IsDecimal from internal proposal: x/text/number: promote IsDecimal from internal and add number system stringer Jul 16, 2022
@golightlyb
Contributor Author

cc @mpvl as per https://dev.golang.org/owners

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Jul 20, 2022
@rsc
Contributor

rsc commented Jul 20, 2022

There is some overlap here with #19787 (decimal in language) and #30870 (decimal in database/sql). Probably we're not ready to move forward on either of those. The reason it's internal in x/text is to make sure any eventual public API is consistent with the general approach.

Putting on hold for the others (it will be a while).

@rsc rsc moved this from Incoming to Hold in Proposals (old) Jul 20, 2022
@rsc
Contributor

rsc commented Jul 20, 2022

Placed on hold.
— rsc for the proposal review group

@golightlyb golightlyb changed the title proposal: x/text/number: promote IsDecimal from internal and add number system stringer proposal: x/text/number: provide way to query number system Jul 20, 2022
@golightlyb
Contributor Author

I'm not sure there's much overlap?

I've edited the proposal down.

It's about l10n, not Go's language support for numbers.

Hopefully this can go back into Incoming

@golightlyb
Contributor Author

@rsc requesting this goes back into Incoming for next proposals review

@rsc
Contributor

rsc commented Jul 27, 2022

Is the proposed API the following?

package number
func SystemFromTag(t language.Tag) (System, language.Confidence)

type System struct { ... }
func (System) String() string

Is String really the only necessary method on number.System? language.Region has many, and language.Script has at least one other than String.

/cc @mpvl

@rsc rsc moved this from Incoming to Active in Proposals (old) Jul 27, 2022
@rsc
Contributor

rsc commented Jul 27, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@golightlyb
Contributor Author

golightlyb commented Jul 27, 2022

Yes - for now, but not forever

There are methods implemented at /x/text/internal/number, in particular Digit, IsDecimal, Symbol.

I'd like to see them exported, but perhaps at a later date - I don't want to block this proposal on deciding whether they are ready. That said, I'm happy for that discussion to happen.

Those methods are quite easy to extract from a single Unicode data file, whereas String() is difficult to duplicate (many data files, in a hierarchy, plus needing to re-implement Go locale parsing)
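To show why the digit lookup is the easy part, here is a small sketch that renders ASCII digits in a decimal number system once its identifier is known. It relies on the fact that each of these systems has ten consecutive code points for 0 through 9. The table and the function name digits are hypothetical and cover only a few of the identifiers mentioned in this issue; non-decimal systems such as taml are simply absent.

```go
package main

import (
	"fmt"
	"strings"
)

// zeroDigit maps a few decimal number system identifiers to their digit
// zero. Illustrative subset only.
var zeroDigit = map[string]rune{
	"latn":     '0',
	"arab":     '\u0660', // ARABIC-INDIC DIGIT ZERO
	"fullwide": '\uFF10', // FULLWIDTH DIGIT ZERO
	"tamldec":  '\u0BE6', // TAMIL DIGIT ZERO
}

// digits replaces every ASCII digit in s with the matching digit of the
// given system and leaves other characters untouched. It reports false if
// the system is not in the table.
func digits(system, s string) (string, bool) {
	zero, ok := zeroDigit[system]
	if !ok {
		return "", false
	}
	var b strings.Builder
	for _, r := range s {
		if r >= '0' && r <= '9' {
			r = zero + (r - '0')
		}
		b.WriteRune(r)
	}
	return b.String(), true
}

func main() {
	for _, sys := range []string{"latn", "arab", "tamldec", "taml"} {
		out, ok := digits(sys, "123456789")
		fmt.Printf("%s: %q ok=%v\n", sys, out, ok)
	}
}
```

Given the identifier, this is a table lookup; obtaining the identifier for an arbitrary tag is the part that only /x/text can answer consistently with its own formatting.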

@rsc rsc removed the Proposal-Hold label Aug 3, 2022
@rsc
Contributor

rsc commented Aug 3, 2022

Any thoughts about this API, @mpvl?

@rsc
Contributor

rsc commented Aug 10, 2022

Does anyone object to this proposal? #53872 (comment)

@rsc
Contributor

rsc commented Aug 17, 2022

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: x/text/number: provide way to query number system x/text/number: provide way to query number system Aug 17, 2022
@rsc rsc modified the milestones: Proposal, Backlog Aug 17, 2022
@gopherbot

Change https://go.dev/cl/465735 mentions this issue: x/text/number: provide a way to query number system

@golightlyb
Contributor Author

CC @thanm for /x/text reviews

@golightlyb
Contributor Author

Friendly bump to see if we can get reviewers on this. CC @thanm

Projects
Status: Accepted