proposal: spec: find a way to export uncased identifiers #22188
This has been discussed before, but to note it on this issue: one possible approach would be to designate a specific Unicode character, not otherwise available for use in identifiers, to designate the identifier as exported. For example, if we use the character This then leads to another decision point. We could treat the |
In addition to what @ianlancetaylor said: There's also a third design choice (besides requiring the special Unicode character always for exported identifiers, or only inside the package that exports the identifier). The third choice is to only require the special Unicode character at the declaration; basically a marker indicating that the following (or perhaps preceding) identifier is exported. There's also an obvious drawback** with this choice, which is that one won't be able to tell (at a use site) whether an identifier is exported or not by simply looking at it. ( ** In the past we have eschewed this idea). Regarding the choice of the special Unicode character: One could choose |
If we are contemplating a change to export rules anyway, we should evaluate whether the same mechanism could be used for fields of cgo-imported structs, as described in #13467. As far as I can tell, the constraints to address that use-case are:
A distinguished Unicode character only satisfies those constraints if it is not required for field references within the same package. |
Maybe this can be solved in each project's own coding rules // if you don't understand the requirement and abstract the concepts very well and can't come up with good names // E for Export // or use a Getter. Sometimes a coder can't come up with a good name in English; this has nothing to do with the programming language. Variable or function names in Chinese are not used much in source code. Other parts of the source code are still not Chinese: if, for, func, return. You don't use variable or function names in Chinese in C or C++ either. |
If they end up requiring $ to declare a variable, I will hate Go |
Why not just make go 2.x be explicit, using keywords like To support legacy code, you could make fields of unspecified visibility use the old, broken behavior so that code still compiles and runs as expected. |
@kstenerud We believe that the fact that one can immediately tell by looking at an identifier whether it is exported or private is a feature. We don't consider it to be broken behavior. In any case this is now so fundamental to Go code that it would not be feasible to change it at this point. |
In Oberon * or - were used as export marks, so using, for example . as the export mark is not unprecedented. (https://cseweb.ucsd.edu/~wgg/CSE131B/oberon2.htm) I feel that the go package system as well as the imports and exports are inspired by Oberon, but with the advantage that upper case identifiers are exported "automatically". Still, I think it is an omission that there can't be an explicit export mark as well. Furthermore, for interoperation between Go and other languages, it might be desirable to export a lower case identifier as well. The fact that one can immediately tell by looking at an identifier whether it is exported or private is only of limited use, since that is only so at the local package level. If the identifier comes from another package, normally, if not using dot imports, it is immediately clear that an identifier is imported, thanks to the package prefix. Therefore I think that the '.' prefix to signify an exported identifier like @griesemer proposes is the best solution for this problem. |
Why not a keyword "export"? It's readable, explicit, and understandable even for newcomers. Please do not give more special characters special meanings, which are incomprehensible at first glance. I also support @kstenerud's opinion. |
Go is partially inspired by Oberon-2, but it has a more C-like syntax. Otherwise we would write begin instead of { and end instead of }, and pointer instead of *. So special characters with special meaning are somewhat normal for Go. |
With the Go language version in go.mod, I think one option here is we could just make uncased identifiers exported iff they're in a package that uses Go 1.17 (or whatever -- I'm going to use 1.17 for concreteness). That is, This would mean users upgrading from Go 1.16 to Go 1.17 might need to rename their identifiers to prevent them from being exported (e.g., rename I'm pretty sure the compilers and reflect API can handle this fine. I'm a little worried about go/token.IsExported; e.g., I see go/doc and net/rpc use it, but maybe they can be handled some other way. |
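For reference, here is a minimal sketch of what the lexical check mentioned above reports today. It only calls the existing `go/token.IsExported`; the wrapper name `exportedToday` and the sample identifiers are illustrative choices, not part of any proposal.

```go
package main

import (
	"fmt"
	"go/token"
)

// exportedToday is a hypothetical wrapper that reports what
// go/token.IsExported says about name under the current rule:
// the first rune must be an upper-case letter.
func exportedToday(name string) bool {
	return token.IsExported(name)
}

func main() {
	// Cased, uncased, and "X"-prefixed identifiers from the discussion.
	for _, name := range []string{"Cost", "cost", "成本", "X成本", "ぶつける"} {
		fmt.Printf("%s exported=%v\n", name, exportedToday(name))
	}
}
```

Under the current rule only `Cost` and `X成本` are reported as exported, which is why a purely lexical check like this one would need version information if the rule changed per Go version.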
As someone who writes lots of tooling that parses Go source files, there's a strong benefit to being able to understand what the source code for a single Go file means without needing to consult some other piece of information (i.e., the For example, |
That was proposed in 2013 but rejected. I think it is a good idea. But if that won't happen, I think it's better to keep the rule. 95% of Chinese programmers won't want to use Chinese variable names. |
@crvv, I can see that not many Chinese Go programmers want to use Chinese function names or variable names up to now, but perhaps this is like the story of the fox that tries to eat grapes from a vine but cannot reach them? If it becomes possible, then I think we are likely to see more people who will want to use Go because now they are able to use it, as is suggested by @lych77 in #5763. Furthermore, there is also a problem of interoperating Go code with other programming languages. In some of those programming languages, the convention is that function or method names should be all lowercase. Therefore it would be convenient for that use case to allow certain non-exported identifiers to become exported, preferably by a |
Yes, there are some use cases where Chinese variable names are very helpful. #30572 (comment)
If the great feature is broken, why not just accept #30572 ? |
So: Export all uncased identifiers
Export sigil only used at declaration, like .成本
Export keyword
Export sigil used only in defining package
Export sigil always used, like $成本
Would anything other than lexical checks for exported-ness break in a user-facing manner? IsExported would only break in token/ast: the versions in go/types and reflect (soon to be added #41563) would continue to be precise. Would many tools be broken irreparably by this? Would it help to deprecate token/ast IsExported well before making a change? Could an explicit notation be limited to uncased identifiers and cased identifiers continue to use the current rules? (:+1:) |
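As a rough illustration of the non-lexical check mentioned above, the sketch below asks `reflect` which struct fields are exported. It assumes Go 1.17 or later, where `reflect.StructField.IsExported` is available; the `ledger` type and the `exportedFields` helper are made up for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

// ledger is a hypothetical struct mixing cased and uncased field names.
type ledger struct {
	Cost int // upper-case first letter: exported
	cost int // lower-case first letter: unexported
	成本   int // uncased first character: unexported under the current rule
}

// exportedFields returns the names of the struct fields of v that
// reflect reports as exported. v must be a struct value.
func exportedFields(v interface{}) []string {
	t := reflect.TypeOf(v)
	var names []string
	for i := 0; i < t.NumField(); i++ {
		if f := t.Field(i); f.IsExported() {
			names = append(names, f.Name)
		}
	}
	return names
}

func main() {
	fmt.Println(exportedFields(ledger{})) // prints [Cost]
}
```

Because this answer comes from type information recorded by the compiler rather than from the spelling of the name, a check of this kind would follow whatever rule the compiler applied.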
I don't have enough knowledge on compiler / language design to tell if this is possibly a terrible solution. An option that has not been suggested before is "in line" with
it is a means to an end, no sigils, no keywords, no change in how the language handles exporting identifiers, more typing |
For CJK, I suggest the following rule that respects @rsc 's "opt-in" philosophy. Below is a concrete example:
|
For many Chinese syllables, the simplified is the same as the traditional. And in some cases you can't tell whether a syllable is traditional Chinese or simplified Chinese. |
Interestingly, lower case Roman characters are actually a simplification of upper case Roman characters, for ease of writing, so the relation is actually opposite of what is suggested here for CJK. Therefore it doesn't sound very realistic. |
@bjorndm your description above
is exactly what's going in Chinese, 简体 characters are actually a simplification of 繁體 characters, for ease of writing. @crvv It's true that the relationship between traditional and simplified is a many-to-one one. |
FWIW, I think we're leaving the era where you can analyze Go source files without version information. E.g., go/types will soon provide per-file Go version information to facilitate the changing rules for As for go/token, we could add an API like:
and change the current top-level functions into deprecated wrappers for |
If an identifier starts with _ or a lowercase character, it is considered a non-exported symbol. If the identifier does not start with _ or a lowercase character, it is considered an exported symbol. Could this solve the problem? |
There are several problems with this approach. How do you export a variable like 成本, which doesn't have different characters in traditional Chinese? How do you tell whether a character is traditional or simplified? 騬 looks like public. But it may be private before Unicode 13 and may become public after Unicode 13. How do you export Japanese words like ぶつける? |
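The rule suggested in this comment can be sketched as a small predicate. This is only one reading of the comment, not an accepted design; `isExportedProposed` is a hypothetical name, and the empty-name handling is an assumption added so the function is total.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// isExportedProposed is a hypothetical helper implementing the rule
// from the comment above: a name is unexported if it starts with '_'
// or a lower-case letter, and exported otherwise.
func isExportedProposed(name string) bool {
	r, size := utf8.DecodeRuneInString(name)
	if size == 0 {
		return false // assumption: an empty name is never exported
	}
	return r != '_' && !unicode.IsLower(r)
}

func main() {
	for _, name := range []string{"Cost", "cost", "_cost", "成本", "ぶつける"} {
		fmt.Printf("%s exported=%v\n", name, isExportedProposed(name))
	}
}
```

With this predicate every identifier that begins with an uncased character, such as `成本`, is treated as exported, so existing uncased identifiers would change status.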
@fumin, thank you for your suggestion. I would really like to see this issue resolved in a way that everyone can agree upon.
FWIW, you are only discussing C of CJK. The Korean writing system does not have any concept similar to upper or lower case. e.g as you mentioned, 简体 is a simplified version of 簡體 in Chinese. But in Korean, it's written as '간체' and there is no alternative form. In contrast, Japanese does not use simplified Chinese characters but has Shinjitai(新字体), which serves a similar purpose to 简体, but they are not interchangeable. In fact, most Japanese people do not use 简体 at all. In addition to these, Taiwan does not use simplified Chinese. They have only one character set.
In my opinion, mixing simplified and traditional characters does not make sense, and is actually hard to type. To do this, one might have to switch IMEs very often, because many people use only one character set at a time. |
For me the most annoying thing when programming in go is constantly having to search&replace in my code if I decide to export a field from a package. |
@ElectricPulse Several editors with gopls support now provide type-aware find&replace which make name changes (e.g. for export) trivial. |
This is a bodge. Why accepting this bodge as a solution is bad can be explained this way: The argument of: "The capitalization makes it clear what is exported" is invalid because you have two scenarios:
TLDR: This feature would make Go code more adaptable to variable name changes with no compromise and make code more readable while freeing up capitalization for the user to decide. If you give me the green light I will start working on a pull request. |
@ElectricPulse My suggestion was intended as a well-meant work-around that may work for your specific situation. It was not meant as a "solution" for this issue. But regarding your comment: How exports are marked in Go is a fundamental feature of the language that has worked surprisingly well for > 15 years. Adding an "export" keyword or any other markers was extensively discussed when Go was designed and we decided against it. We are not going to change that. This issue is about making it possible for users of alphabets that have no obvious capitalization (CJK languages) to mark identifiers as exported without resorting to an "X" (or what have you) prefix. It is not about introducing another mechanism for languages that have capitalization. That would fundamentally change the look and feel of Go programs which we don't want. See also #5763 for a more complete discussion. Something like this idea might be doable now if coupled with the language version. You are of course free to experiment to your heart's content. But any change will require a proposal accepted by the wider Go community. Thanks. |
Since we now have a language version in go.mod, it seems like we can implement the idea that identifiers that start with characters with certain Unicode properties, such as Kanji or other uncased writing systems, are also exported. Maybe we should make a list of such character ranges first. |
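A list of character ranges like the one suggested here could start from the script tables already in the standard `unicode` package. The sketch below is an assumption about what such a list might contain (Han, Hiragana, Katakana, Hangul); the names `uncasedScripts` and `startsUncased` are hypothetical, and the list is not exhaustive.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// uncasedScripts is an illustrative, incomplete list of scripts
// that have no upper/lower case distinction.
var uncasedScripts = []*unicode.RangeTable{
	unicode.Han,
	unicode.Hiragana,
	unicode.Katakana,
	unicode.Hangul,
}

// startsUncased reports whether the first rune of name belongs to
// one of the scripts listed in uncasedScripts.
func startsUncased(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return unicode.In(r, uncasedScripts...)
}

func main() {
	for _, name := range []string{"成本", "ぶつける", "간체", "cost", "Cost"} {
		fmt.Printf("%s startsUncased=%v\n", name, startsUncased(name))
	}
}
```

Whether identifiers matched this way should be exported, and in which language versions, is the open question in this thread; the predicate only shows that the classification itself is straightforward to express.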
See in particular #5763 (comment). |
#5763 (comment) observes “It is very strange to use, say Z成本 or Jぶつける as identifiers.” In that issue we discussed potentially changing the default export rule, but as of #5763 (comment), which seemed to have general agreement, we decided against that.
Even so we do want to find a way to export uncased identifiers, or at least consider ways, in order to address the original observation.
This issue is for discussion of non-breaking ways to export uncased identifiers.