Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: make "byte offset on line" error position info available #10324

Closed
bruno-medeiros opened this issue Apr 3, 2015 · 22 comments
Closed
Milestone

Comments

@bruno-medeiros
Copy link

When a Go source file has an error, only the line number is reported by the go compiler. So for example, with:

package other
import "fmt"
func OtherHello2() {
    fmtxxx.Printf("hello, world from other\n")
}

the Go compiler error message:
other\blah.go:4: undefined: fmtxxx
This is okay for command line usage, but when Go is invoked by editors and IDEs, it would be an improvement if a full source range was reported - that is, the start offset (line+column) of the error, and the end offset (line+column).

As an example, here's the error output from the Rust compiler for a similar error:

src\main.rs:8:5: 8:15 error: macro undefined: 'printlnxxx!'
src\main.rs:8     printlnxxx!("Hello world");
                  ^~~~~~~~~~

The Rust compiler goes even further and prints the range in the command-line UI too. That's not required though, what I'm asking is just the extra offset info reported in the first line. "8:5" is the start of the error range and "8:15" the end of the range. Reporting the error location as an absolute character offset (as opposed to line+column), would also work fine.

With this, IDE's can reported errors as squiggly lines in a source editor. Example, see how "xxx" is underlined:

sample_basic thumb
As a sidenote, https://github.com/GoClipse/goclipse in particular already supports this functionality, but it needs the compiler to report the full source ranges.

@ianlancetaylor ianlancetaylor changed the title cmd/go: Provide full source range for compiler errors [enhancement] cmd/gc: Provide full source range for compiler errors [enhancement] Apr 3, 2015
@ianlancetaylor
Copy link
Contributor

I'll note that both gccgo and the go/parser package track column information. I'm not sure either tracks a complete range, though.

@robpike
Copy link
Contributor

robpike commented Apr 4, 2015

What is a column?

@griesemer
Copy link
Contributor

filename:line:column: msg

where column is counted in bytes (at least for go/parser)

The go/ast has full range information for each node.

@bruno-medeiros
Copy link
Author

What is a column?

The "complement" of a line: a position (usually one-based index) counting from the last newline character. As opposed to a position counting from the start of the file (an absolute offset).
It's called a column since that's where the position would appear if the source is viewed in an editor.

BTW, it would be preferable to report Unicode character positions rather than raw byte positions.

@cespare
Copy link
Contributor

cespare commented Apr 8, 2015

It's called a column since that's where the position would appear if the source is viewed in an editor.

I suspect Rob was making the point that "column" only makes sense if the text layout is a grid (i.e. fixed-width fonts). A few people, including @robpike, are known for using proportional fonts in their editors. In this context "column" is less meaningful.

@robpike
Copy link
Contributor

robpike commented Apr 8, 2015

Not just that. Some characters, such as those in some Asian languages are wider than Latin characters. The concept of column is parochial.

I deeply object to the noise that the proposed style of error reporting introduces but it seems likely I will lose this fight. The best I can do is point out its technical inadequacies.

@robpike
Copy link
Contributor

robpike commented Apr 8, 2015

Also: The idea of Unicode character position is also poorly defined. The concept of character is poorly defined, and then the concept of column is poorly defined on top of that.

@minux
Copy link
Member

minux commented Apr 8, 2015

as cmd/gc doesn't currently track column (or byte, rune offset)
information, it's hard to fix this issue without big changes. And
I don't think the benefit outweighs the complexity for cmd/gc.
If someday cmd/gc starts to use go/* packages for its frontend,
then the issue should be much easier to fix.

For IDEs, it's much better to use go/parser and go/types for this
purpose.

@bruno-medeiros
Copy link
Author

I suspect Rob was making the point that "column" only makes sense if the text layout is a grid (i.e. fixed-width fonts). A few people, including @robpike, are known for using proportional fonts in their editors. In this context "column" is less meaningful.

"column" is just the term that it is called because its what it looks like on fixed-width fonts. Yes there are no columns in non-fixed-width fonts, but the term and the value are still useful. Just as software "bugs" are not actual insects, and when we save files we no longer save them in "floppy disks", but that is still the icon most graphical applications use. You can call it what you want, as long as the tool gives the value I want... :)

I did give a definition of what I meant by "column" in my previous post, but I will clarify it:
The "column" of the start position of the error is:

  • The byte index of the character where the error starts, counting from the last newline (or from the beginning of the file if no newline exists before the error). This index can be a zero-based index, or a one-based index. (Most tools of other languages report it as one-based. For IDEs it makes no difference as it can easily be converted to the other)

Alternatively, instead of a byte index, a Unicode code point index could be provided. This is preferable.

@bruno-medeiros
Copy link
Author

Also: The idea of Unicode character position is also poorly defined. The concept of character is poorly defined, and then the concept of column is poorly defined on top of that.

My usage of "character" is not poorly defined: I was using the same definition that the Go language specification uses:

"For simplicity, this document will use the unqualified term character to refer to a Unicode code point in the source text. "
https://golang.org/ref/spec#Introduction

A source text can be viewed as a sequence of Unicode code points. When I mention code point index (or position), I mean the index in that sequence where said code point occurs. This should clarify it.

As for the term column, clarified it in the previous post.

@rsc
Copy link
Contributor

rsc commented Apr 10, 2015

This also breaks badly for generated code. If //line is being used, the compiler should suppress "column" information.

@rsc rsc added this to the Unplanned milestone Apr 10, 2015
@rsc rsc changed the title cmd/gc: Provide full source range for compiler errors [enhancement] cmd/gc: make error column info available for punch card IDEs Apr 10, 2015
@rsc rsc changed the title cmd/gc: make error column info available for punch card IDEs cmd/compile: make error column info available for punch card IDEs Jun 8, 2015
@bruno-medeiros
Copy link
Author

@rsc "for punch card IDEs" ?... Is this supposed to be a joke? 😮

Virtually all modern IDEs and code editors support the display of errors overlayed in the source editor, in the form of a line subrange (which means they need a start column, possibly an end column as well). For example, Eclipse, IntelliJ, Visual Studio Code, Atom, Visual Studio , and likely many others that I didn't bother searching for screenshots. I'm fairly sure even the ancient Vim and Emacs understand the concept of columns as well, with regards to error messages (although I didn't check).

So, if the implication is that all those IDEs and editors above are archaic, I'd really like to know what wonderful tool y'all are using to edit source code...

@griesemer
Copy link
Contributor

@bruno-medeiros Regarding your question: Virtually nobody in the Go team uses an IDE which explains the lack of enthusiasm for more detailed position information. Don't take this as a value judgement, it's just a fact and a preference. We understand that a lot of people prefer to use some sort of IDE.

That said, as I have pointed out before, the go/ast provides full range information.

There is some not insignificant cost associated with tracking exact position information in a compiler, and the existing syntax tree in the compiler doesn't do it. There some significant engineering required to make it work correctly and reasonably efficiently. It may be done at some point, but it's not a high priority, hence the milestone of "unplanned".

@rsc rsc changed the title cmd/compile: make error column info available for punch card IDEs cmd/compile: make "byte offset on line" error position info available Apr 14, 2016
@rsc
Copy link
Contributor

rsc commented Apr 14, 2016

To respond to an earlier comment, if we do this it should be byte offset, not Unicode code point offset. There's almost no Go setting where Unicode code point makes sense: byte offsets give you efficient random access into a UTF-8 representation of the source, code point offsets give that up, and neither actually matches characters on the screen (due to complications like combining accents). I do understand that Unicode code point offset makes sense in Java because that's the native string representation and so it would give efficient random access in that setting. But Go should target Go, not Java.

@bruno-medeiros
Copy link
Author

@griesemer Thanks for the answer. It does make sense that there would be little interest in this functionality if none of the Go team is using an IDE (or graphical editor). As for efficiency though, wouldn't it be okay if the detailed tracking was only enabled with a compiler option? This way, an implementation that wasn't as efficient as ideally possible could be allowed in.

@rsc A byte offset instead of code point would be very welcome too. The conversion back to code point could always be done back in the IDE, and it would be relatively straightforward.

@mvdan
Copy link
Member

mvdan commented Apr 25, 2016

@griesemer even if not using an IDE, many text editors can take filename:line:col to open a file at the desired position. I use this very often to quickly open files and fix small mistakes when compiling.

Most of the time the line number is enough, but at times it can be confusing. Example: https://play.golang.org/p/eFauWxodFX

This may sound silly since you can very quickly see the func signature and figure out which of the two s is causing the error. But in the more general case, it might not be as obvious. Having the compiler point at the line's byte offset for the identifier would be useful.

@griesemer
Copy link
Contributor

@mvdan Thanks for the example. Again, I think we all understand the rationale, and again, go/parser and friends provide column information (byte offset from start of line).

As pointed out before, there's some significant plumbing required to make this work. I agree it would be nice to have. It's simply unplanned for now because we have more important things to work on at the moment.

@p0nce
Copy link

p0nce commented May 4, 2016

To add 2 cents, it was added in the D compiler against Walter's opinion ("slow downs lexer") and some debate. It turned out to be less noisy than expected and useful for various editor/IDE integration. I think people would defend column information with their teeth now.

@robpike
Copy link
Contributor

robpike commented May 4, 2016

Which teeth? Be specific. https://en.wikipedia.org/wiki/Universal_Numbering_System :)

@gopherbot
Copy link

CL https://golang.org/cl/34244 mentions this issue.

gopherbot pushed a commit that referenced this issue Dec 9, 2016
…hen rune count for column value

This will only become user-visible if error messages show column information.
Per the discussion in #10324.

For #10324.

Change-Id: I5959c1655aba74bb1a22fdc261cd728ffcfa6912
Reviewed-on: https://go-review.googlesource.com/34244
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
@griesemer
Copy link
Contributor

We now have precise position information (line# and byte offset per line). Should provide a mechanism to enable/disable byte offsets in error messages (and decide on a default).

@gopherbot
Copy link

CL https://golang.org/cl/37970 mentions this issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

10 participants