Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

debug/dwarf: 0-length array problem #3505

Closed
kylelemons opened this issue Apr 10, 2012 · 30 comments
Closed

debug/dwarf: 0-length array problem #3505

kylelemons opened this issue Apr 10, 2012 · 30 comments
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@kylelemons
Copy link
Contributor

I'm attempting to wrap libusb so I can interface with my kinect.  I have basic control
working and have written the bulk and interrupt code, but isochronous requires using the
asynchronous API, which cgo seems to be having trouble with.

Consider the following code:

/* iso.go */
package usb

// #include <libusb-1.0/libusb.h>
import "C"

type Transfer C.struct_libusb_transfer
/* end */

The struct is documented here:
http://libusb.sourceforge.net/api-1.0/structlibusb__transfer.html

and defined here (see line 765):
http://libusb.sourceforge.net/api-1.0/libusb_8h_source.html

notably:
{...
struct libusb_iso_packet_descriptor iso_packet_desc[0];
...}

and cgo reports:
iso.go:7:15: struct size calculation error off=72 bytesize=64

$ gcc --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM
build 2335.15.00)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ go version
go version go1

$ go env
GOROOT="/opt/go"
GOBIN=""
GOARCH="amd64"
GOCHAR="6"
GOOS="darwin"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOTOOLDIR="/opt/go/pkg/tool/darwin_amd64"
GOGCCFLAGS="-g -O2 -fPIC -m64 -pthread -fno-common"
CGO_ENABLED="1"
@rsc
Copy link
Contributor

rsc commented Apr 10, 2012

Comment 1:

The DWARF format uses 1-indexed array indices, so it cannot
distinguish between a 0-length array and a 1-length array.
Instead we try to tell them apart using the offset and size
information associated with the struct field and the surrounding
struct.  It is clearly not working in this case.
If you would like to try to fix this, the relevant code is in
src/pkg/debug/dwarf/type.go:
        if t.Kind != "union" {
            b, ok := e.Val(AttrByteSize).(int64)
            if ok && b*8 == lastFieldBitOffset {
                // Final field must be zero width.  Fix array length.
                zeroArray(lastFieldType)
            }
        }
The question is why that's not firing for the field in question,
assuming it is at the end of the array.  If it is in the middle of
the array then the relevant code is a few lines up:
                if bito == lastFieldBitOffset && t.Kind != "union" {
                    // Last field was zero width.  Fix array length.
                    // (DWARF writes out 0-length arrays as if they were 1-length arrays.)
                    zeroArray(lastFieldType)
                }

@rsc
Copy link
Contributor

rsc commented Apr 10, 2012

Comment 2:

Labels changed: added priority-later, go1.1, removed priority-triage.

Status changed to Accepted.

@kylelemons
Copy link
Contributor Author

Comment 3:

Ah, cool.  I was definitely looking in the wrong place :).  I'll have a poke at this
tonight and see if I can't figure out why it's misbehaving.  I'll also try to duplicate
on linux/amd64.

@kylelemons
Copy link
Contributor Author

Comment 4:

The same code works fine on linux, so I suspect LLVM.
Here is a short repro:
package p
/*
struct s {
        int a;
        int b;
};
struct flex {
        long a;
        int b;
        struct s c[];
};
*/
import "C"
type (
        Flex C.struct_flex
)
Here is the progress so far on my debugging:
debug/dwarf sees the struct as follows:
[flex] []dwarf.Field{dwarf.Field{Attr:dwarf.AttrSibling, Val:0x15c},
dwarf.Field{Attr:dwarf.AttrName, Val:"flex"}, dwarf.Field{Attr:dwarf.AttrByteSize,
Val:16}, dwarf.Field{Attr:dwarf.AttrDeclFile, Val:2},
dwarf.Field{Attr:dwarf.AttrDeclLine, Val:8}}
[flex] BitO 0, Field &dwarf.StructField{Name:"a", Type:(*dwarf.IntType)(0x42323780),
ByteOffset:0, ByteSize:0, BitOffset:0, BitSize:0}
[flex] BitO 8, Field &dwarf.StructField{Name:"b", Type:(*dwarf.IntType)(0x42323900),
ByteOffset:8, ByteSize:0, BitOffset:0, BitSize:0}
[flex] BitO 12, Field &dwarf.StructField{Name:"c", Type:(*dwarf.ArrayType)(0x4231a640),
ByteOffset:12, ByteSize:0, BitOffset:0, BitSize:0}
[flex] LastBitO 12, AttrByteSize 16
cgo sees:
[flex] [0] a long int 8 (off=8)
[flex] [1] b int 4 (off=12)
[flex] [2] c typeof(struct s[1]) 8 (off=20)
The error is:
iso.go:18:7: struct flex size calculation error off=20 bytesize=16
If I add or remove fields from the struct, the error will sometimes go away.  It appears
as if LLVM is setting the struct size in DWARF to be word-aligned.
The only fix I can come up with is to always set the length of a terminal array to zero,
but this seems like it would have complications.

@kylelemons
Copy link
Contributor Author

Comment 5:

Okay, I was tricking myself by comparing dwarfdump with debug/dwarf.  It turns out it
doesn't distinguish between type[] and type[0] when printed out.  The following change
works for cgo:
diff -r 734071724054 src/pkg/debug/dwarf/type.go
--- a/src/pkg/debug/dwarf/type.go       Sat Apr 07 17:02:44 2012 +0400
+++ b/src/pkg/debug/dwarf/type.go       Tue Apr 10 22:27:09 2012 -0700
@@ -332,7 +332,7 @@
                        switch kid.Tag {
                        case TagSubrangeType:
                                max, ok := kid.Val(AttrUpperBound).(int64)
-                               if !ok {
+                               if !ok || max == 0 {
                                        max = -2 // Count == -1, as in x[].
                                }
                                if ndim == 0 {
but appears to break the my_struct1 test.  I have a test in misc/cgo/tests that tests
both [] and [#] cases, which appears to work correctly, so I'm at a loss as to why
anything here changed.

@kylelemons
Copy link
Contributor Author

Comment 6:

Nevermind; type[], type[0], and type[1] all appear indistinguishable.
I'm out of ideas.

@rsc
Copy link
Contributor

rsc commented Apr 11, 2012

Comment 7:

Yes, by themselves.  But as part of a struct, you can tell them apart
by looking at how far away the next struct field (or end of the struct) is.

@kylelemons
Copy link
Contributor Author

Comment 8:

I'll have to take a closer look, but I think that (with the llvm on os x) I could tell
the difference between [0] and [1] that way but [] could look like one, the other, or
neither depending on the size of the struct.  It was often four bytes away from the end
of the struct even for a 12-byte element type.

@kylelemons
Copy link
Contributor Author

Comment 9:

Here’s what I see from GCC: (on linux)
  struct issue_3505_inner {a int@0; b int@4} size=8
  struct issue_3505_zero {a long int@0; b int@8; c [-1]t_issue_3505_inner@12} size=16
  struct issue_3505_flex {a long int@0; b int@8; c [-1]t_issue_3505_inner@12} size=16
  struct issue_3505_one {a long int@0; b int@8; c [1]t_issue_3505_inner@12} size=24
The structure and bit offsets look fine; I expect [] and [0] to be the same.
The sizes are off, however.
The DWARF code assumes that the last field should have size zero if it’s flexible, but
that’s not the case.
The only reason DWARF knows these are flexible are the lack of upper bounds for the
dimensions.
Here’s what I see from LLVM: (on mac)
  struct issue_3505_inner {a int@0; b int@4} size=8
  struct issue_3505_zero {a long int@0; b int@8; c [1]t_issue_3505_inner@12} size=16
  struct issue_3505_flex {a long int@0; b int@8; c [1]t_issue_3505_inner@12} size=16
  struct issue_3505_one {a long int@0; b int@8; c [1]t_issue_3505_inner@12} size=24
Notice that everything we don’t compute is the same.
The difference here is that a dimension (0, unfortunately) is specified in all cases.
This should fall back to the code that looks if the size is the same as the last offset,
but this isn’t the case for this structure (in neither LLVM nor GCC).
My proposal:
Assume that any terminal array of size 1 or fewer is a variable length array.
 1. Before C90, the incantation seems to have been struct { type field[1]; }
 2. With C90, the somewhat more explicit form struct { type field[0]; } seems to be allowed
 3. With C99, the explicit form struct { type field[]; } was added
This will have no effect on nonterminal arrays or where the terminal array has 2+
elements.
This will adversely affect one-length terminal arrays in cgo which are accessed directly
(see below).
This will interpret both LLVM and GCC DWARF-2 output for the same C code identically.
To work around the adverse effect above, we could remove bounds checking from cgo for
flexible arrays.
I think with the above two changes, we'd be able to handle this odd case and still be
backward compatible.
Thoughts?

@jpoirier
Copy link
Member

Comment 10:

Just wondering if anyone has looked into this since the last post? I tried to compile
Kyle's gousb package on OSX today but ran into this problem; the quick hack was to
change the struct's flexible member size to 1 in libusb.h.
AFAIK, a zero length array (struct { type field[0]; }) has never been valid C code for
an ANSI compliant compiler, it's a GCC extension. C99 introduced the empty array size
for flexible members but prior to C99 the array size would be set to 1 when using the
struct hack, for compiler's other than GCC that is.
Question, is the statement "DWARF writes out 0-length arrays as if they were 1-length
arrays." in type.go strictly DWARF version 2; the GCC asm output with debug info that I
get contradicts this statement. My output shows the zero length flexible members as size
0 and the one length flexible member as non-zero. There's a gist with snippets of the C
code structures, all three variants, and the asm output here
"https://gist.github.com/3764752" where DW_AT_byte_size shows the byte size of the
structure, including the flexible member.

@kylelemons
Copy link
Contributor Author

Comment 11:

Notice the structs I used.  Stick in a long so it's more than a word wide, stick in an
int to throw off alignment, and then drop in the array.
$ cat > test.c
int main() {}
struct ArrEmpty {
    long big;
    int  small;
    int  arr[];
} empty;
struct ArrZero {
    long big;
    int  small;
    int  arr[0];
} zero;
struct ArrOne {
    long big;
    int  small;
    int  arr[1];
} one;
struct ArrTwo {
    long big;
    int  small;
    int  arr[2];
} two;
$ gcc -gdwarf-2 -S -fverbose-asm -c test.c
$ egrep "AT_(name|byte_size)" test.s | grep -A1 Arr
        .ascii   "ArrEmpty"             ## DW_AT_name
        .byte   16                      ## DW_AT_byte_size
--
        .ascii   "ArrZero"              ## DW_AT_name
        .byte   16                      ## DW_AT_byte_size
--
        .ascii   "ArrOne"               ## DW_AT_name
        .byte   16                      ## DW_AT_byte_size
--
        .ascii   "ArrTwo"               ## DW_AT_name
        .byte   24                      ## DW_AT_byte_size

@jpoirier
Copy link
Member

Comment 12:

I'm not sure I understand what it is your saying?
In your example the compiler adds tail padding for alignment purposes but the zero
length arrays still have a size of zero, tail_padding = sizeof(struct ArrEmpty) -
offsetof(struct ArrEmpty, arr), and sizeof empty.arr would return a value of 0 (is it
valid to call the sizeof macro on a 0-length flexible array member?).
For flexible array members, the C99 spec states that the size of the structure is as if
the flexible array member were omitted and if the array has no elements it behaves as if
it had one element - I'm guessing GCC's spec states something similar for its flexible
array member. It's possible for the C99 behavior statement to be misinterpreted as
implying a flexible array member has the size of one element when in fact its size is
zero. The DWARF spec, which I quickly browsed over, didn't seem to have much to say
about the subject.

@rsc
Copy link
Contributor

rsc commented Dec 10, 2012

Comment 13:

Labels changed: added size-l.

@gopherbot
Copy link

Comment 14 by pj@born2code.net:

The work around proposed by Joseph works for me. To help others that don't have a huge C
background, here are the steps I followed:
1. Open `libusb.h` from `/usr/local/include/libusb-1.0/libusb.h`.
2. Change [0] to [1] at line #937.
Here is my gist of the changed `libusb.h`:
https://gist.github.com/4578277#file-libusb-h-L937

@kylelemons
Copy link
Contributor Author

Comment 15:

jdp: What my example was meant to show was that, with llvm, the DWARF output does not
provide enough detail to be able to distinguish the [], [0], and [1] cases.  Calling
sizeof() would work, but that cannot be done by simply reading the DWARF output.
The "simplest" way that I can think of to solve this once and for all would be to make
it a two-step process: the first one is the current process that useses DWARF to
determine the structs and their fields and generates C code to dumps sizeof() for each
struct and sizeof and offsetof each member.  This leveragaes two assumptions: (1) all
compilers will be able to generate DWARF output containing correct struct names and
field names and (2) all compilers will correctly implement sizeof() and offsetof().
For the record, I'm also using a patched libusb.h because maintaining runtime changes
was a pain.

@minux
Copy link
Member

minux commented Jan 23, 2013

Comment 16:

since gcc 4.6 and above supports -fdump-go-spec=out.go, we can use
that if available.

@jpoirier
Copy link
Member

Comment 17:

@kevlar  what you were saying became clear only after I went down the rabbit hole
myself. :(  I dug into the gcc compiler code and some code for a couple static analysis
tools and they all did multiple passes (something like 2 to 5 IIRC) in a process of
elimination fashion. It sounds like you're headed in a similar direction.
I had something partially working (2 out of the 3 combinations) on the Go side but its
been so long...

@kylelemons
Copy link
Contributor Author

Comment 18:

Yeah... I found that the only cross-platform solution is to modify the header file I was
using to say [1] instead of [].  I'm accessing it using unsafe anyway, so setting the
length myself is not a problem.

@rsc
Copy link
Contributor

rsc commented Mar 12, 2013

Comment 19:

Labels changed: added go1.1maybe, removed go1.1.

@robpike
Copy link
Contributor

robpike commented May 18, 2013

Comment 20:

Labels changed: added go1.2maybe, removed go1.1maybe.

@robpike
Copy link
Contributor

robpike commented Aug 16, 2013

Comment 21:

Deferring to 1.3.

Labels changed: added go1.3maybe, removed go1.2maybe.

@robpike
Copy link
Contributor

robpike commented Aug 20, 2013

Comment 22:

Labels changed: removed go1.3maybe.

@lukescott
Copy link

Comment 23:

I'm not sure if this is related or not, but I have having an issue with a union inside a
struct. Kyle thinks it is. My struct looks like this:
typedef enum {
    Bool = 1,
    Int32,
    Uint32,
    Number,
    String,
    Array,
    Function,
    Object,
    External,
} ValueType;
typedef struct {
    ValueType type;
    union {
        bool boolean;
        int32_t integer;
        uint32_t uinteger;
        double number;
        char* cstring;
        void* ptr;
    };
} Value;
And I am getting:
context.go:21:11: struct size calculation error off=37 bytesize=16
This has severely put a stop to my work.
I've tried re-ordering the fields to no avail.
https://groups.google.com/forum/#!topic/golang-nuts/rOiJb_H20JM

@lukescott
Copy link

Comment 24:

I'm not sure if this is related or not, but I am having an issue with a union inside a
struct. Kyle thinks it is. My struct looks like this:
typedef enum {
    Bool = 1,
    Int32,
    Uint32,
    Number,
    String,
    Array,
    Function,
    Object,
    External,
} ValueType;
typedef struct {
    ValueType type;
    union {
        bool boolean;
        int32_t integer;
        uint32_t uinteger;
        double number;
        char* cstring;
        void* ptr;
    };
} Value;
And I am getting:
context.go:21:11: struct size calculation error off=37 bytesize=16
This has severely put a stop to my work.
I've tried re-ordering the fields to no avail.
https://groups.google.com/forum/#!topic/golang-nuts/rOiJb_H20JM

@rsc
Copy link
Contributor

rsc commented Nov 27, 2013

Comment 25:

Labels changed: added go1.3maybe.

@rsc
Copy link
Contributor

rsc commented Dec 4, 2013

Comment 26:

Labels changed: added release-none, removed go1.3maybe.

@rsc
Copy link
Contributor

rsc commented Dec 4, 2013

Comment 27:

Labels changed: added repo-main.

@ianlancetaylor
Copy link
Contributor

Is this still a problem? I can't get the test case in comment 4 to reproduce with any combination of Go version or compiler. I don't see any other complete test cases here.

If this is still a problem, we need instructions on how to reproduce it. Thanks.

@agnivade
Copy link
Contributor

Just tested with latest 1.9.2. No issues seen. Can this be an artifact of older Go versions ?

@agnivade agnivade added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Sep 24, 2018
@gopherbot
Copy link

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@golang golang locked and limited conversation to collaborators Oct 24, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

9 participants