Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/doc: Synopsis can return a sentence spanning multiple paragraphs and code blocks #31947

Open
dmitshur opened this issue May 9, 2019 · 3 comments
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@dmitshur
Copy link
Contributor

dmitshur commented May 9, 2019

The Synopsis function defines the logic to determine the synopsis, often used on package documentation. The logic is:

Synopsis returns a cleaned version of the first sentence in s. That sentence ends after the first period followed by space and not preceded by exactly one uppercase letter. The result string has no \n, \r, or \t characters and uses only single spaces between words. If s starts with any of the IllegalPrefixes, the result is the empty string.

Let "paragraph" mean a block of text separated from other text by a blank line (i.e., "\n\n"). The current synopsis logic means a sentence can span across multiple paragraphs. For example:

fmt.Println(doc.Synopsis(`This is a sentence that starts in the first paragraph

and it keeps going in the second paragraph

and ends in the third paragraph. This is the second sentence.`))

// Output: This is a sentence that starts in the first paragraph and it keeps going in the second paragraph and ends in the third paragraph.

(Playground link: https://play.golang.org/p/hSAetYyxkwa)

Perhaps we should consider changing the logic such that a sentence is not allowed to span multiple paragraphs.

From what I've observed, that is rarely used intentionally, but can happen unintentionally. For example, the current version of the github.com/rs/cors package has a very long synopsis that spans multiple paragraphs and includes code blocks:

$ go list -f '{{.Doc}}' github.com/rs/cors
Package cors is net/http handler to handle CORS related requests as defined by http://www.w3.org/TR/cors/ You can configure it by passing an option struct to cors.New: c := cors.New(cors.Options{ AllowedOrigins: []string{"foo.com"}, AllowedMethods: []string{http.MethodGet, http.MethodPost, http.MethodDelete}, AllowCredentials: true, }) Then insert the handler in the chain: handler = c.Handler(handler) See Options documentation for more options.

/cc @griesemer @julieqiu

Edit: Another one is github.com/peterhellberg/link:

$ go list -f '{{.Doc}}' github.com/peterhellberg/link
Package link parses Link headers used for pagination, as defined in RFC 5988 Installation Just go get the package: go get -u github.com/peterhellberg/link Usage A small usage example package main import ( "fmt" "net/http" "github.com/peterhellberg/link" ) func main() { for _, l := range link.Parse(`<https://example.com/?page=2>; rel="next"; foo="bar"`) { fmt.Printf("URI: %q, Rel: %q, Extra: %+v\n", l.URI, l.Rel, l.Extra) // URI: "https://example.com/?page=2", Rel: "next", Extra: map[foo:bar] } if resp, err := http.Get("https://api.github.com/search/code?q=Println+user:golang"); err == nil { for _, l := range link.ParseResponse(resp) { fmt.Printf("URI: %q, Rel: %q, Extra: %+v\n", l.URI, l.Rel, l.Extra) // URI: "https://api.github.com/search/code?q=Println+user%3Agolang&page=2", Rel: "next", Extra: map[] // URI: "https://api.github.com/search/code?q=Println+user%3Agolang&page=34", Rel: "last", Extra: map[] } } }

Edit 2: Another one is github.com/astaxie/beego/orm@v1.12.2:

$ go list -f '{{.Doc}}' github.com/astaxie/beego/orm
Package orm provide ORM for MySQL/PostgreSQL/sqlite Simple Usage package main import ( "fmt" "github.com/astaxie/beego/orm" _ "github.com/go-sql-driver/mysql" // import your used driver ) // Model Struct type User struct { Id int `orm:"auto"` Name string `orm:"size(100)"` } func init() { orm.RegisterDataBase("default", "mysql", "root:root@/my_db?charset=utf8", 30) } func main() { o := orm.NewOrm() user := User{Name: "slene"} // insert id, err := o.Insert(&user) // update user.Name = "astaxie" num, err := o.Update(&user) // read one u := User{Id: user.Id} err = o.Read(&u) // delete num, err = o.Delete(&u) } more docs: http://beego.me/docs/mvc/model/overview.md
@dmitshur dmitshur added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label May 9, 2019
@dmitshur dmitshur added this to the Go1.14 milestone May 9, 2019
@griesemer
Copy link
Contributor

Seems reasonable. Should perhaps run a small experiment and see if and which synopsis change, say for the std lib (I expect close to none to change).

@dmitshur
Copy link
Contributor Author

Sounds good. To make progress on this issue, we should run that experiment and confirm the results are favorable. We should be able to include a corpus with a number of third-party packages in the experiment, in addition to the standard library packages.

@andybons andybons added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels May 14, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@dmitshur
Copy link
Contributor Author

I ran an experiment on the standard library and some modules that were recently published (by using the module index). The results seem favorable, and there are quite a few cases where the synopsis extends well beyond the first paragraph.

First, there is one diff in the standard library:

cmd@v1.15.3/internal/obj/ppc64/doc.go:
"Package ppc64 implements a PPC64 assembler that assembles Go asm into the corresponding PPC64 instructions as defined by the Power ISA 3.0B. This document provides information on how to write code in Go assembler for PPC64, focusing on the differences between Go and PPC64 assembly language."
→
"Package ppc64 implements a PPC64 assembler that assembles Go asm into the corresponding PPC64 instructions as defined by the Power ISA 3.0B."

It's a good change. It helps with a case in the cmd/internal/obj/ppc64 package where the sentence ends with "3.0B.", which isn't recognized as the end of a sentence (which itself a small issue, but it's out of scope here).

There are more diffs in the golang.org/x modules:

golang.org/x/crypto@v0.0.0-20201016220609-9e8e0b390897/otr/otr.go:
"Package otr implements the Off The Record protocol as specified in http://www.cypherpunks.ca/otr/Protocol-v2-3.1.0.html The version of OTR implemented by this package has been deprecated (https://bugs.otr.im/lib/libotr/issues/140)."
→
"Package otr implements the Off The Record protocol as specified in http://www.cypherpunks.ca/otr/Protocol-v2-3.1.0.html"

golang.org/x/crypto@v0.0.0-20201016220609-9e8e0b390897/pkcs12/internal/rc2/rc2.go:
"Package rc2 implements the RC2 cipher https://www.ietf.org/rfc/rfc2268.txt http://people.csail.mit.edu/rivest/pubs/KRRR98.pdf This code is licensed under the MIT license."
→
"Package rc2 implements the RC2 cipher"

golang.org/x/crypto@v0.0.0-20201016220609-9e8e0b390897/xtea/cipher.go:
"Package xtea implements XTEA encryption, as defined in Needham and Wheeler's 1997 technical report, \"Tea extensions.\" XTEA is a legacy cipher and its short block size makes it vulnerable to birthday bound attacks (see https://sweet32.info)."
→
"Package xtea implements XTEA encryption, as defined in Needham and Wheeler's 1997 technical report, \"Tea extensions.\""

golang.org/x/mobile@v0.0.0-20200801112145-973feb4309de/cmd/gobind/doc.go:
"Gobind generates language bindings that make it possible to call Go functions from Java and Objective-C. Typically gobind is not used directly."
→
"Gobind generates language bindings that make it possible to call Go functions from Java and Objective-C."

golang.org/x/net@v0.0.0-20201016165138-7b1cca2348c0/publicsuffix/list.go:
"Package publicsuffix provides a public suffix list based on data from https://publicsuffix.org/ A public suffix is one under which Internet users can directly register names."
→
"Package publicsuffix provides a public suffix list based on data from https://publicsuffix.org/"

golang.org/x/net@v0.0.0-20201016165138-7b1cca2348c0/webdav/litmus_test_server.go:
"This program is a server for the WebDAV 'litmus' compliance test at http://www.webdav.org/neon/litmus/ To run the test: go run litmus_test_server.go and separately, from the downloaded litmus-xxx directory: make URL=http://localhost:9999/ check"
→
"This program is a server for the WebDAV 'litmus' compliance test at http://www.webdav.org/neon/litmus/ To run the test:"

golang.org/x/sys@v0.0.0-20201018230417-eeed37f84f13/windows/mkwinsyscall/mkwinsyscall.go:
"mkwinsyscall generates windows system call bodies It parses all files specified on command line containing function prototypes (like syscall_windows.go) and prints system call bodies to standard output."
→
"mkwinsyscall generates windows system call bodies"

golang.org/x/tools@v0.0.0-20201017001424-6003fad69a88/cmd/benchcmp/doc.go:
"Deprecated: benchcmp is deprecated in favor of benchstat: golang.org/x/perf/cmd/benchstat The benchcmp command displays performance changes between benchmarks."
→
"Deprecated: benchcmp is deprecated in favor of benchstat: golang.org/x/perf/cmd/benchstat"

These changes seem favorable too. Most happen because the first sentence is missing a period. A few happen because there is a period but it is not recognized due to a quote character, or being mistaken for an acronym.

The benchcmp change is questionable, but its documentation doesn't follow the suggested style of starting a package comment with "Package [name] ..."; the deprecated paragraph needs to be moved below.

In third-party packages, the vast majority of instances have a zero diff (9505 .go files out of 1070 unique modules). There are instances where the diff is non-zero (91 .go files out of 1061 unique modules). The changes overall look favorable. In many cases, they help reduce an accidentally very long synopsis to one that is much shorter and more readable. In the remaining cases, it turns a poor synopsis into a slightly shorter but also poor synopsis. You can see the raw data here.

Based on this data, I think it's reasonable to move this issue to NeedsFix state. @griesemer Do you agree?

@dmitshur dmitshur added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Oct 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

4 participants