Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/doc: expose block-parsing interfaces and constants #13001

Closed
binarycrusader opened this issue Oct 20, 2015 · 7 comments
Closed

go/doc: expose block-parsing interfaces and constants #13001

binarycrusader opened this issue Oct 20, 2015 · 7 comments
Milestone

Comments

@binarycrusader
Copy link
Contributor

While working on a way to address #101, it was discovered that some of the functions, constants, and structures offered in go/doc/comment.go are useful for parsing Go source code comments.

In particular, blocks(), block, op, opPara, opHead, and opPre.

Currently, consumers wanting to generate their own documentation or process Go source code are left with duplicating the existing logic found in go/doc/comment.go or adding special-case logic to that file itself for specific formats (undesirable). I would like to suggest that the aforementioned items be made public, or moved to a more suitable package for re-use by others.

Doing so would make it easier to add support for the generation of new documentation formats, such as manual pages, etc. and would likely provide a cleaner result.

If the desire is to keep these items where they are today, I would rename them as follows:

 blocks -> Blocks
 block -> Block
   block.op -> block.Op
   block.lines -> block.Lines
 op -> Op
 opPara -> OpPara
 opHead -> OpHead
 opPre -> OpPre

I am open to counter-suggestions as to how this logic should be exposed and I am willing to perform the work requested.

This bug was filed at the request of @adg

@adg adg changed the title go/doc: consider exposing block-parsing interfaces and constants go/doc: expose block-parsing interfaces and constants Oct 21, 2015
@adg
Copy link
Contributor

adg commented Oct 21, 2015

This seems reasonable to me, and the API surface is small.

But the name Op is poor. I'd prefer BlockType (or maybe BlockKind), so the new API might look like this:

func Blocks(text string) []Block

type Block struct {
        Type BlockType
        Lines []string
}

type BlockType int

const (
        Paragraph BlockType = iota
        Heading
        Preformatted
)

Thoughts from @griesemer, @bradfitz? Anyone else?

@rsc
Copy link
Contributor

rsc commented Oct 23, 2015

As the author of this code, I think it is a mistake to expand the API here. That hard-codes some of today's choices in ways that are left much more malleable by the current ToText/ToHTML API.

@binarycrusader
Copy link
Contributor Author

Thanks @rsc ; what would you suggest be done then to allow others to write code that generates other formats? The choice most people seem to reach for is to simply duplicate what's there since it can't be reused directly, which seems unfortunate. Is there a better alternative?

@rsc
Copy link
Contributor

rsc commented Oct 23, 2015

Duplicating what's there is preferable to committing to an API that we may regret. There are just not that many formats, and copy+modify is a perfectly acceptable strategy.

@rsc rsc added this to the Go1.6 milestone Oct 23, 2015
@binarycrusader
Copy link
Contributor Author

@rsc if exporting the interfaces is undesirable, would there be any objection to renaming the existing ones (maintaining lowercase so as to not expose them) per @adg 's suggestion? Or is that sort of change generally frowned upon. I only ask because at least then, when it's copied somewhere else, it would likely have more suitable naming and thus be easier to find later.

@jimmyfrasche
Copy link
Member

The API is indeed far from perfect, though I'd completely be okay if it were just exposed as is, as one who has just copied it into my own packages in the past.

Here's a rough sketch of an alternative API, that would make most tasks simpler and be a bit more idiomatic.

The names are largely immaterial, and almost certainly the first thing I thought of; the functionality and structure is what I'm suggesting.

type Node interface { //or Block or whatever
   //for headers Text() == Lines()[0]
   //for paragraphs Text() == strings.Join(Lines(), " ") 
   //    where each line in Lines() has been run through strings.TrimSpace
   //for preformatted Text() == strings.Join(Lines(), "\n"),
   //    where no line in Lines() contains the initial indent that marked this node as preformatted
   Text() string
   Lines() []string //each line in the node with type-specific processing as described above
   Original() string //the slice of the original text string with no parsing
}

type NodeHeader struct {...} //all these implement Node, none export anything

type NodePreformatted struct {...}

type NodeParagraph struct {...}

func Parse(text string) []Node

Type switching would replace op.

Lines() would be equivalent to the lines field on block.

Note that

func OriginalInvariant(text string) bool {
  s := ""
  for _, n := range Parse(text) {
    s += n.Original()
  }
  return s == text
}

would return true for all text, allowing line numbering and the like to be derived if necessary.

The Text() method (named like the method on bufio.Scanner for the same reason) would cover the majority of uses and Lines() and Original() would allow any others.

I didn't include a node() method in the interface because it doesn't matter. If someone wants to define their own Node and reparse certain nodes using Original, I say let them go nuts.

@binarycrusader
Copy link
Contributor Author

Based on @rsc 's comments, I'm closing my request. I will attempt to address this in a different manner.

@golang golang locked and limited conversation to collaborators Nov 4, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants