cmd/compile, cmd/link: generate dwarf type info in compiler #52209

Open
10 of 13 tasks
zhouguangyuan0718 opened this issue Apr 7, 2022 · 68 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
@zhouguangyuan0718
Contributor

zhouguangyuan0718 commented Apr 7, 2022

Background

I have worked on dynamic linking (plugin, -linkshared, -buildmode=shared) in Go for two years. I noticed that dynamically linked libraries can't be debugged, and the reason is that the DWARF type info is generated in the linker.
Related issues:
#38378
#44982

Later, I saw this in #47788 (comment).

@thanm said:

One thing we can do that would make the problem more tractable in general would be to move DWARF type DIE generation out of the linker and into the compiler (this is an idea that we've toyed with in the past but haven't gotten to). Doing that would definitely help for the "link shared against libstd.so" use case.

Thanks to @thanm's suggestion, I have spent a long time trying to implement DWARF type info generation in the compiler. It works now, though it is a very early version. I think this is the right time to submit a proposal, and perhaps I can improve the work and contribute it to the Go community.

Proposal

The DWARF type info can be generated in the compiler instead of the linker. My initial version implements it this way:

  1. Collect all the types used in the current compile unit: the types of global variables, local variables, and constants, plus any type whose runtime type must be kept.
  2. Run the equivalent of the linker's "defgotype" on them. Unlike the linker, the compiler does not need to decode the information from the binary; it can read it directly from the types.Type struct.
  3. Synthesize map types, chan types, and so on. Because some prototype types (runtime.hchan, runtime.hmap, ...) are not in the current compile unit, we must create "stub" types for them.
  4. Before dumpobj runs in the compiler, convert all the type DIEs to LSyms and put them in the data section.
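
The walk in steps 1 to 3 can be sketched roughly as follows. This is only a toy model under stated assumptions: `Type`, `DIE`, and `defType` are hypothetical names invented for illustration, not the compiler's `types.Type` or the linker's `defgotype`, and treating every type from another package as a stub is just one possible reading of the "stub" idea above.

```go
package main

import (
	"fmt"
	"sort"
)

// Type is a toy stand-in for a compiler type (hypothetical, not types.Type).
type Type struct {
	Name  string  // qualified name, e.g. "main.Node"
	Pkg   string  // defining package; "" for builtin types
	Elems []*Type // component types (fields, element types, ...)
}

// DIE is a toy type DIE: a name plus references to other DIEs by name.
type DIE struct {
	Name string
	Refs []string
	Stub bool // true if only a placeholder was emitted
}

// defType emits a DIE for t and, recursively, for the types it refers to.
// Types defined outside curPkg get a stub instead of a full definition.
func defType(t *Type, curPkg string, out map[string]*DIE) {
	if _, done := out[t.Name]; done {
		return // already emitted; this also breaks reference cycles
	}
	die := &DIE{Name: t.Name}
	out[t.Name] = die
	if t.Pkg != "" && t.Pkg != curPkg {
		die.Stub = true
		return
	}
	for _, e := range t.Elems {
		die.Refs = append(die.Refs, e.Name)
		defType(e, curPkg, out)
	}
}

func main() {
	intT := &Type{Name: "int"}
	hchan := &Type{Name: "runtime.hchan", Pkg: "runtime", Elems: []*Type{intT}}
	node := &Type{Name: "main.Node", Pkg: "main"}
	node.Elems = []*Type{intT, node, hchan} // self-reference, like a linked list

	dies := map[string]*DIE{}
	defType(node, "main", dies)

	names := make([]string, 0, len(dies))
	for n := range dies {
		names = append(names, n)
	}
	sort.Strings(names)
	for _, n := range names {
		fmt.Printf("%s stub=%v refs=%v\n", n, dies[n].Stub, dies[n].Refs)
	}
}
```

The `out` map doubles as the visited set, so recursive types terminate without extra bookkeeping.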

I will send a CL to share my very early prototype.

If this proposal has a reasonable chance of being accepted, I will continue implementing it. I think there is still a lot to do.
TODO:

  1. Clarify the DWARF symbol name and type name.
  2. Use an aux sym for the typedef sym instead of looking it up by name.
  3. Investigate why fixedbugs/issue30908.go failed. It seems there are some differing duplicate DWARF symbols.
  4. Distinguish -linkshared from static linking. For -linkshared, generate complete DWARF types for every compile unit. For static linking, generate only the DWARF types defined in the current compile unit. Users of -linkshared could then debug the dynamically linked library.

Costs

  • What is the compile time cost?
    Longer, but not by much, because packages are compiled in parallel. Link time may become shorter in exchange.
  • What is the run time cost?
    No cost.
  • Can you describe a possible implementation?
    See above.
  • Do you have a prototype? (This is not required.)
    Yes.

Update 2022.5.10:

An initial to-do list for the formal change:

  • Implement some base interfaces for DWARF in the compiler.
  • Generate DWARF info symbols in the compiler for the types defined in the current compile unit (except types that need to be synthesized).
  • Create type descriptions for synthesized types in the compiler, so that more type info can be generated and synthesized there.
  • Add some utility functions for synthesizing types.
  • Synthesize ptr, hchan, hmap, string, and slice in the compiler.
  • Use the DWARF symbols generated by the compiler in the linker.
  • Clean up redundant and duplicate code.
  • Generate the whole DWARF type info in the compiler for dynlink.
  • Do not generate the types that must be emitted by the runtime.
  • Add a new reloc type for using an aux symbol for type info.
  • Refactor and optimize the DWDIE struct to reduce the number of pointers.
  • Unify the way DIEs are created in internal/dwarf.
  • Add test cases for these CLs.
@gopherbot added this to the Proposal milestone Apr 7, 2022
@gopherbot

Change https://go.dev/cl/398735 mentions this issue: cmd/compile,link: generating dwarf type info in compiler

@ianlancetaylor
Contributor

There is no user visible change here (except that debug info is better), so this doesn't have to go through the proposal process. The linker/compiler/runtime team can decide what to do here.

CC @golang/runtime

@ianlancetaylor changed the title from "proposal: cmd/compile,link: generating dwarf type info in compiler" to "cmd/compile, cmd/link: generate dwarf type info in compiler" Apr 7, 2022
@ianlancetaylor added the NeedsInvestigation label (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.) and removed the Proposal label Apr 7, 2022
@ianlancetaylor modified the milestones: Proposal, Backlog Apr 7, 2022
@zhouguangyuan0718
Contributor Author

I will send a chain of CLs to implement it formally.

@gopherbot

Change https://go.dev/cl/399059 mentions this issue: cmd/link,compile: refacte dwarf generation of linker

@gopherbot

Change https://go.dev/cl/399294 mentions this issue: cmd/internal/dwarf: define interface for dwarf type info

@gopherbot

Change https://go.dev/cl/399062 mentions this issue: cmd/link: extract newtype1 function

@gopherbot

Change https://go.dev/cl/399295 mentions this issue: cmd/link: complate decoupling for newtype

@gopherbot

Change https://go.dev/cl/399063 mentions this issue: cmd/link,cmd/internal/dwarf: move newtype to dwarf package

@gopherbot

Change https://go.dev/cl/399064 mentions this issue: cmd/compile: implement dwarf.Type interface for types.Type

@gopherbot

Change https://go.dev/cl/399296 mentions this issue: cmd/internal/dwarf: clarify symbol name and dwarfname

@gopherbot

Change https://go.dev/cl/399297 mentions this issue: cmd/internal/dwarf: clarify Type and Sym

@gopherbot

Change https://go.dev/cl/399298 mentions this issue: cmd/link,obj/dwarf: unify DefGoType and DefPtrTo

@gopherbot

Change https://go.dev/cl/399301 mentions this issue: cmd/internal/obj/dwarf: implement some base method for dwCtxt

@gopherbot

Change https://go.dev/cl/399302 mentions this issue: cmd/compile: generate dwarf type info as aux sym of type

@thanm
Contributor

thanm commented Apr 11, 2022

@zhouguangyuan0718 thanks for writing this up and sending a first set of patches. Let me know when they are ready for review.

Moving type generation from linker to compiler is definitely a worthy project. If I were doing this work myself, there are some other cleanups I would also consider folding in as well. Specifically:

  • the way the linker currently does type DIE generation is more "C-like" than it should be (basically since the code was originally written in C and then converted/translated to Go). The DWDie/DWAttr setup is also more memory-intensive than it needs to be (these structs also have a lot of pointers, making extra work for the GC).

  • it is also confusing to have two entirely separate ways of creating DWARF DIEs in the toolchain (e.g. the code that emits subprogram and related DIEs has one way of creating the DWARF, and the linker code has an entirely different way via DWDie/DWAttr). Ideally we would want to unify these two.

With that said, these cleanups are independent and can be tackled separately at some future time.

@gopherbot

Change https://go.dev/cl/399275 mentions this issue: cmd/link: use the dwarf type info generated by compiler

@zhouguangyuan0718
Contributor Author

@thanm thanks for your reply. I'm glad to know it is a worthy project.
It now basically works with CL 399275 and its relation chain (with some bugs I'm still investigating). At this point, the DWARF type info is generated in both the compiler and the linker. The compiler generates the types defined in the current compile unit, except the synthesized types; the linker generates everything else. I will move the remaining pieces over step by step.
To make sure I am not on the wrong track, could you take a quick look at the current patch sets? They begin with CL 399059. Thanks very much.

I want to do some cleanup, too. I didn't add duplicate code to the compiler; I moved some code from the linker to cmd/internal/dwarf, so that both the compiler and the linker can use it. This should make further unification easier in the future.

@thanm
Contributor

thanm commented Apr 12, 2022

To make sure I am not on the wrong track, could you take a quick look at the current patch sets? They begin with CL 399059. Thanks very much.

I took a very quick skim. In general looks to be moving in the right direction.

I am not sure about https://go-review.googlesource.com/c/go/+/399302, however: what is the purpose of making DWARF type die symbols into aux symbols attached to type symbols? If the type DIE symbol is an aux, it means you can't look it up by name. It seems to me that this will force a lot of rewriting/post-processing in the linker.

@zhouguangyuan0718
Contributor Author

zhouguangyuan0718 commented Apr 12, 2022

it means you can't look it up by name.

Sorry, maybe I used an incorrect combination of aux and pkgDef for the DWARF symbols? It seems I can still look them up by name now, and I can also reach them through the aux symbols of a type symbol.

what is the purpose of making DWARF type die symbols into aux symbols attached to type symbols?

Based on the understanding I described above, I hoped that all the DWARF types could be collected by following the aux symbols of the reachable Go types and then traversing their relocs. That may be faster than name lookup.

If I have missed something that prevents this approach, I will put them in data.

@gopherbot

Change https://go.dev/cl/399877 mentions this issue: cmd/compile: add a generator for synthesize dwarf type

@thanm
Contributor

thanm commented Apr 12, 2022

It doesn't really make sense to have a symbol that is both aux and pkgdef. The primary reason we make the other DWARF sub-symbols (ex: DWARFLINES, DWARFFCN) aux symbols is that we never have to look them up by name. Better to have the DWARF type DIE symbols just be regular named symbols.
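
To make the distinction concrete, here is a minimal sketch of the two lookup paths. It is a toy model only: `Sym`, `SymTab`, and the symbol names are made up for illustration and do not correspond to the real cmd/internal/obj or linker loader data structures.

```go
package main

import "fmt"

// Sym is a toy object-file symbol. Aux symbols carry no name in this model.
type Sym struct {
	Name string
	Aux  []*Sym // sub-symbols reachable only through their parent
}

// SymTab indexes named symbols; aux symbols are deliberately not indexed.
type SymTab struct {
	byName map[string]*Sym
}

func NewSymTab() *SymTab { return &SymTab{byName: make(map[string]*Sym)} }

// AddNamed creates a named symbol that Lookup can find later.
func (t *SymTab) AddNamed(name string) *Sym {
	s := &Sym{Name: name}
	t.byName[name] = s
	return s
}

// AddAux attaches a nameless sub-symbol to parent.
func (t *SymTab) AddAux(parent *Sym) *Sym {
	a := &Sym{}
	parent.Aux = append(parent.Aux, a)
	return a
}

// Lookup returns the named symbol, or nil if no such name was registered.
func (t *SymTab) Lookup(name string) *Sym { return t.byName[name] }

func main() {
	tab := NewSymTab()
	typ := tab.AddNamed("type.main.T") // made-up symbol names
	tab.AddAux(typ)                    // reachable only via typ.Aux
	tab.AddNamed("info.main.T")        // reachable from anywhere by name

	fmt.Println("named DIE found by name:", tab.Lookup("info.main.T") != nil)
	fmt.Println("aux symbols on type symbol:", len(typ.Aux))
}
```

In this model, code that holds only a type name can reach the named symbol directly, while the aux symbol requires first locating its parent.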

@zhouguangyuan0718
Contributor Author

zhouguangyuan0718 commented Apr 12, 2022

Thank you for the explanation. Handling their relocations correctly may be a little difficult. I ran into many relocation problems at the beginning; in particular, some DWARF type symbols are nameless and duplicated. Still, I will keep trying to make them regular symbols.

The other thing I am unsure about is CL 399877. Should I do it this way, or put them into cmd/compile/internal/typecheck/builtin, then fix mkbuiltin.go and handle them in a common way? Could you give me some advice?

@gopherbot

Change https://go.dev/cl/400874 mentions this issue: cmd/compile: do not dump incorrect prototype when using -linkshared

@zhouguangyuan0718
Contributor Author

Hi @thanm, I have refactored my work tree. I removed a lot of transitional code and grouped related changes into each CL, so I think it will be easier to review.
I believe all the tasks I marked as done in the issue description are now ready for review. Please review them when you have time. I will continue with the remaining tasks. Thank you.

zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
DWARF type info can be generated from an abstract type, which makes the generation independent of any specific implementation. Define a TypeContext interface to provide some base methods used while generating DWARF type info.

For golang#52209

Change-Id: I4d8fd3c19bcb9a22cdb037fb5f2b05e1a5907e23
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
Implement the cmd/internal/dwarf.Type interface for cmd/compile/internal/types.Type with a wrapper.

For golang#52209

Change-Id: I46ce4735bfcf900d9e17c14b6f69ea542eac0a8a
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
…r dwCtxt

Only generate type info for types defined in the current compile unit, so only emit a reloc for a subtype when referencing it.

For golang#52209

Change-Id: If723e0d9588c1e82cf1a129ac4c8ec7eb3b1d98
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
To generate DWARF type info in the compiler, copy some functions from cmd/link/internal/ld/dwarf.go to cmd/internal/dwarf and refactor them to be independent of the linker. No functional change for now; they will be used in the compiler later.
The duplicate code in the linker will be removed at the end.

For golang#52209

Change-Id: I10c9bbaf530f84f3fe4d94687a9325b665a7a9ce
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
cmd/link/internal/ld/dwarf.go:/^func.newtype --> cmd/internal/dwarf/dwarf.go:/^func.NewType

Also refactor it to fit the Type interface. newtype in cmd/link/internal/ld/dwarf.go will be removed at the end.

For golang#52209

Change-Id: Ie95eff712418ee454a9c3887dcd7e7488a42b011
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
Generate DWARF type info and dump it to the object file. It is not used by the linker yet.
Some types need to be synthesized; I will handle that in the next CLs.

For golang#52209

Change-Id: Ie7c67197f930462ae4474f291e02d3009b6651bc
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
These functions are copied from cmd/link/internal/ld/dwarf.go. Refactor them to fit the Type and TypeContext interfaces.

For golang#52209

Change-Id: I27831f65f998f5bc100182f598be25870ac916da
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
This code is also copied and refactored from cmd/link/internal/ld/dwarf.go.

For golang#52209

Change-Id: I58f6c401898c2d3350fdf51d17ca059ec829dfa5
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
To synthesize DWARF types in the compiler, the prototypes of string, map, slice, and channel are needed. They can't easily be obtained when compiling code outside of runtime, so add them to builtin so that we can get the prototype information.

More detailed explanation:
When a map (e.g. map[int]int) is used, we need more DWARF type info than is visible at the language level. The pseudo types hash<int,int>, bucket<int,int>, []val<int>, and []key<int> are generated to make debugging easier. They are "template" types that must be filled in when a map is instantiated, and they can't be defined directly from the existing Go runtime types. So they should be added to builtin, to fill in the "template" when generating their DWARF type info.
The same applies to the other types.
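
As a small illustration of the "template" idea, the sketch below builds only the pseudo type names listed above for a given key and value type. The helper names `internalTypeName` and `mapDebugTypeNames` are hypothetical, and the code covers naming only, not the contents of the generated DIEs.

```go
package main

import "fmt"

// internalTypeName builds a name of the form "base<arg1>" or "base<arg1,arg2>".
func internalTypeName(base, arg1, arg2 string) string {
	if arg2 == "" {
		return fmt.Sprintf("%s<%s>", base, arg1)
	}
	return fmt.Sprintf("%s<%s,%s>", base, arg1, arg2)
}

// mapDebugTypeNames returns the pseudo type names for a map with the given
// key and value type names, following the pattern in the text above.
func mapDebugTypeNames(key, val string) []string {
	return []string{
		internalTypeName("hash", key, val),
		internalTypeName("bucket", key, val),
		internalTypeName("[]key", key, ""),
		internalTypeName("[]val", val, ""),
	}
}

func main() {
	for _, n := range mapDebugTypeNames("int", "int") {
		fmt.Println(n)
	}
}
```

Each instantiated map type gets its own set of names, which is why the templates have to be filled in per key and value type.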

See golang#52209 (comment)

For golang#52209

Change-Id: Ifcb1e15d2300323980b320673adb67df2ee07956
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
Now we can synthesize types in the compiler, and all the DWARF type symbols can be kept alive by relocs, so subtype in the linker can be removed. There are some undefined relocs to runtime types in the DWARF types.
These relocs are added by the compiler, which doesn't know whether the corresponding runtime type will be defined. So skip them in relocsym.

For golang#52209

Change-Id: Ief7683f8d9033ebf0f682f582e2fffde63587bf0
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
It's time to begin using the dwarf type info generated by compiler in linker.

For golang#52209

Change-Id: Ic12f4ed1f510454880d6ef9c8608b405c5404a92
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
Now that all the DWARF type info is generated in the compiler, this code is no longer needed in the linker. Remove it.

For golang#52209

Change-Id: I9cfe253bd4b4ea7d93d0c4cfbf6d309f566f4552
zhouguangyuan0718 added a commit to zhouguangyuan0718/go that referenced this issue Apr 20, 2022
…fo when -linkshared

Now all the DWARF types can be generated by the compiler. With static linking, every compile unit generates its own DWARF type DIEs. With -linkshared, all the types the current compile unit needs can be collected, so we can generate the corresponding DWARF type info for each of them and its subtypes.
For synthesized types, we also need to keep their type info when using -linkshared.

For golang#52209

Change-Id: Ia70ba39708ce710c45d923d5b86dac91d7e9d616
@thanm
Contributor

thanm commented Apr 21, 2022

Thanks. I'll take a look once I clear up some time.

@gopherbot

Change https://go.dev/cl/402258 mentions this issue: internal/dwarf: create dupok sym when create internal type

@gopherbot

Change https://go.dev/cl/403334 mentions this issue: cmd/compile: only emit basic dwarf type info for runtime package

@thanm
Contributor

thanm commented May 4, 2022

I'll take a look once I clear up some time.

Still tied up with pre-release-freeze development work. Hope to look at this next week.

@gopherbot

Change https://go.dev/cl/404217 mentions this issue: cmd/compile: add some testcases for dwarf types

@zhouguangyuan0718
Contributor Author

I'll take a look once I clear up some time.

Still tied up with pre-release-freeze development work. Hope to look at this next week.

Thanks. Is it possible for this to be merged for 1.19? If so, I will finish the remaining work as soon as possible.

@gopherbot

Change https://go.dev/cl/404734 mentions this issue: cmd/compile: fix interface print in debugger for dynamic link

@gopherbot

Change https://go.dev/cl/404754 mentions this issue: misc/cgo/testshared: add testcases for dwarf info of buildmode shared

@gopherbot

Change https://go.dev/cl/404755 mentions this issue: cmd/link: add testcases for dwarf of linkshared in TestDWARF

@gopherbot

Change https://go.dev/cl/405455 mentions this issue: cmd/compile: move constants of type map to objabi

@zhouguangyuan0718
Contributor Author

$ gotip build -a -debug-trace old.out cmd/go
$ ~/01.Code/00.godev/go/bin/go build -a -debug-trace new.out cmd/go

new.txt
old.txt

@zhouguangyuan0718
Contributor Author

@thanm, sorry for the delay. I have rebased, resolved the conflicts, and addressed some of the review comments. Can we continue reviewing these and move them closer to merging?

@zhouguangyuan0718
Contributor Author

@thanm @mdempsky Could we continue reviewing these and move them closer to merging? Or is there a patch that you think I need to refactor?

@ajwerner

One observation is that all of the types of a Go program appear in compile units called runtime. Is there any intention to associate the types with the compile unit that exists for the package? If so, would that be covered by this work?


7 participants