Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plugin: Can't load (plugin.Open) multiple plugins with the same func/symbol types #32923

Closed
cmitchell opened this issue Jul 3, 2019 · 2 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@cmitchell
Copy link

What version of Go are you using (go version)?

$ go version
1.12.6

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

Alpine v3.10 running in docker (this setup works fine with the previous version of go used, 1.11.5)

go env Output
$ go env

What did you do?

Upgraded to Alpine v3.10, which includes go1.12.6 (From go1.11.5)

We have a list of plugins, the location of which (if they are to be loaded) are defined by a configuration file. Each plugin has the same set of functions so that the main program can invoke them in a loop, without knowing which plugin is executing it. I.e., the functions are

func Init(configFile string) (bool, string)
func EnrichResults(toEnrich []map[string]interface{}, secTokens []string, applicableFields []string)

which get invoked by the main program via

// executeEnrichmentPlugins executes any enrichment plugins that the system knows about.
func (qf *queryFederator) executeEnrichmentPlugins(user *common.User, results []map[string]interface{}, enrichmentFields []string) {
	for _, idPlugin := range qf.enrichments {
		for _, plug := range idPlugin {
			plug(results, user.Tokens, enrichmentFields)
		}
	}
}

The plugins are being loaded by the main program as such

func loadEnrichmentPlugins(conf *metasearchd.Metasearchd,
	enrichmentPlugs *[]map[string]func([]map[string]interface{}, []string, []string),
	metadata map[string]map[string]string, loadingWg *sync.WaitGroup) {

	var enrichWg sync.WaitGroup
	enrichWg.Add(len(conf.IdentifierPlugin))

	go func() {
		enrichWg.Wait()
		log.Infoln("***Loading enrichment plugins complete***")
		loadingWg.Done()
	}()

	epMutex := &sync.RWMutex{}

	// Load enricher plugins
	log.Infoln("***Loading enrichment plugins... ***")

	for _, pMap := range conf.EnrichmentPlugin {

		// Let each plugin startup in its own thread so that if one plugin has a long start up time,
		// all the other plugins aren't held up from starting
		go func(pMap map[string]interface{}) {
			defer enrichWg.Done()
			pluginMetadata := make(map[string]string)
			filename := pMap["path"].(string)

			p, err := plugin.Open(filename)
			if err != nil {
				log.Errorf("Plugin named %s with path of %s defined, but .so not found. Skipping", pMap["id"].(string), filename)
				return
			}

			symbol, err := p.Lookup("EnrichResults")
			if err != nil {
				log.Error(err)
				return
			}

			pluginFunc, ok := symbol.(func([]map[string]interface{}, []string, []string))
			if !ok {
				log.Errorf("Plugin %s has no 'EnrichResults([]map[string]interface{}, []string, []string)' function", pMap["id"].(string))
				return
			}

			initSymbol, err := p.Lookup("Init")
			if err != nil {
				log.Error(err)
				return
			}

			initFunc, ok := initSymbol.(func(string) (bool, string))
			if !ok {
				log.Errorf("Plugin %s has no 'Init(string) (bool, string)' function", pMap["id"].(string))
				return
			}

			success, version := initFunc(pMap["config"].(string))
			if success {
				log.Infof("%s plugin v%s loaded", pMap["id"], version)

				epMutex.Lock()
				*enrichmentPlugs = append(*enrichmentPlugs, map[string]func([]map[string]interface{}, []string, []string){pMap["id"].(string): pluginFunc})
				pluginMetadata["legend"] = pMap["legend"].(string)
				pluginMetadata["indicator"] = pMap["indicator"].(string)
				metadata[pMap["id"].(string)] = pluginMetadata
				epMutex.Unlock()

			} else {
				log.Errorf("%s plugin v%s could not be loaded.", pMap["id"], version)
			}
		}(pMap)
	}
}

The code was not changed between upgrading golang versions. We have not switched to using go modules. We build both the main program and the plugins in the same docker image (which is now Alpine 3.10 based)
Build command for the main program

GOPATH="$(GOPATH)" GOOS=linux go build -tags "json1 fts5" -o ../bin/metasearchd metadata-search-service/metasearchd/metasearchd

Build command for the plugins

GOPATH="$(GOPATH)" GOOS=linux go build -ldflags "-X main.Version=${VERSION}" -buildmode=plugin -o ../../bin/plugins/rule_enricher_plugin.so metadata-search-service/plugins/rule_enricher_plugin

What did you expect to see?

The plugins all being loaded, as happens in go1.11.5

What did you see instead?

The first plugin will load fine. In fact, if we only try to load one plugin, it always loads. However, if we try to load multiple plugins with different names, but the same functions as described above, it would result in a segmentation fault 90% of the time. It wasn't every time, but restarting the image would cause the fault if it didn't happen the very first spin up (which sometimes it did) The part of the stack trace that was the same each time is as follows

unexpected fault address 0x8000
metadata-search-service_1                   | fatal error: fault
metadata-search-service_1                   | [signal SIGSEGV: segmentation violation code=0x1 addr=0x8000 pc=0x8000]
metadata-search-service_1                   | 
metadata-search-service_1                   | goroutine 366 [running]:
metadata-search-service_1                   | runtime.throw(0x55e2cb28a7ec, 0x5)
metadata-search-service_1                   | 	/usr/lib/go/src/runtime/panic.go:617 +0x74 fp=0xc0006ebdf8 sp=0xc0006ebdc8 pc=0x55e2ca74de44
metadata-search-service_1                   | runtime: unexpected return pc for runtime.sigpanic called from 0x8000
metadata-search-service_1                   | stack: frame={sp:0xc0006ebdf8, fp:0xc0006ebe28} stack=[0xc0006ea000,0xc0006ec000)
metadata-search-service_1                   | 000000c0006ebcf8:  000000c0006ebde2  000000c0006ebd38 
metadata-search-service_1                   | 000000c0006ebd08:  000055e2ca74f570 <runtime.recordForPanic+304>  000055e2cc37e85f 
metadata-search-service_1                   | 000000c0006ebd18:  000055e2cb288034  0000000000000001 
metadata-search-service_1                   | 000000c0006ebd28:  0000000000000001  0000000000000000 
metadata-search-service_1                   | 000000c0006ebd38:  000000c0006ebd78  000055e2ca74f726 <runtime.gwrite+166> 
metadata-search-service_1                   | 000000c0006ebd48:  0000000000000002  000055e2cb288034 
metadata-search-service_1                   | 000000c0006ebd58:  0000000000000001  000055e200000001 
metadata-search-service_1                   | 000000c0006ebd68:  000000c0006ebde2  0000000000000006 
metadata-search-service_1                   | 000000c0006ebd78:  000000c0006ebdc8  000055e2ca74ff4c <runtime.printstring+124> 
metadata-search-service_1                   | 000000c0006ebd88:  000055e2ca74e019 <runtime.fatalthrow+89>  000000c0006ebd98 
metadata-search-service_1                   | 000000c0006ebd98:  000055e2ca778050 <runtime.fatalthrow.func1+0>  000000c000483380 
metadata-search-service_1                   | 000000c0unexpected fault address 0x8000
metadata-search-service_1                   | fatal error: fault
metadata-search-service_1                   | [signal SIGSEGV: segmentation violation code=0x1 addr=0x8000 pc=0x8000]
metadata-search-service_1                   | 
metadata-search-service_1                   | goroutine 366 [running]:
metadata-search-service_1                   | runtime.throw(0x55e2cb28a7ec, 0x5)
metadata-search-service_1                   | 	/usr/lib/go/src/runtime/panic.go:617 +0x74 fp=0xc0006ebdf8 sp=0xc0006ebdc8 pc=0x55e2ca74de44
metadata-search-service_1                   | runtime: unexpected return pc for runtime.sigpanic called from 0x8000
metadata-search-service_1                   | stack: frame={sp:0xc0006ebdf8, fp:0xc0006ebe28} stack=[0xc0006ea000,0xc0006ec000)
metadata-search-service_1                   | 000000c0006ebcf8:  000000c0006ebde2  000000c0006ebd38 
metadata-search-service_1                   | 000000c0006ebd08:  000055e2ca74f570 <runtime.recordForPanic+304>  000055e2cc37e85f 
metadata-search-service_1                   | 000000c0006ebd18:  000055e2cb288034  0000000000000001 
metadata-search-service_1                   | 000000c0006ebd28:  0000000000000001  0000000000000000 
metadata-search-service_1                   | 000000c0006ebd38:  000000c0006ebd78  000055e2ca74f726 <runtime.gwrite+166> 
metadata-search-service_1                   | 000000c0006ebd48:  0000000000000002  000055e2cb288034 
metadata-search-service_1                   | 000000c0006ebd58:  0000000000000001  000055e200000001 
metadata-search-service_1                   | 000000c0006ebd68:  000000c0006ebde2  0000000000000006 
metadata-search-service_1                   | 000000c0006ebd78:  000000c0006ebdc8  000055e2ca74ff4c <runtime.printstring+124> 
metadata-search-service_1                   | 000000c0006ebd88:  000055e2ca74e019 <runtime.fatalthrow+89>  000000c0006ebd98 
metadata-search-service_1                   | 000000c0006ebd98:  000055e2ca778050 <runtime.fatalthrow.func1+0>  000000c000483380 
metadata-search-service_1                   | 000000c0006ebda8:  000055e2ca74de44 <runtime.throw+116>  000000c0006ebdc8 
metadata-search-service_1                   | 000000c0006ebdb8:  000000c0006ebde8  000055e2ca74de44 <runtime.throw+116> 
metadata-search-service_1                   | 000000c0006ebdc8:  000000c0006ebdd0  000055e2ca777fc0 <runtime.throw.func1+0> 
metadata-search-service_1                   | 000000c0006ebdd8:  000055e2cb28a7ec  0000000000000005 
metadata-search-service_1                   | 000000c0006ebde8:  000000c0006ebe18  000055e2ca7637d5 <runtime.sigpanic+1029> 
metadata-search-service_1                   | 000000c0006ebdf8: <000055e2cb28a7ec  0000000000000005 
metadata-search-service_1                   | 000000c0006ebe08:  0000000000000000  0000000000008000 
metadata-search-service_1                   | 000000c0006ebe18:  000000c000000000 !0000000000008000 
metadata-search-service_1                   | 000000c0006ebe28: >0000000000000000  0000000000000000 
metadata-search-service_1                   | 000000c0006ebe38:  0000000000000000  0000000000000206 
metadata-search-service_1                   | 000000c0006ebe48:  ffffffffffffffff  0000000000000027 
metadata-search-service_1                   | 000000c0006ebe58:  000000c00068b320  0000000000000001 
metadata-search-service_1                   | 000000c0006ebe68:  0000000000000001  0000000000000000 
metadata-search-service_1                   | 000000c0006ebe78:  0000000000000000  0000000000000000 
metadata-search-service_1                   | 000000c0006ebe88:  0000000000000000  000000000000003e 
metadata-search-service_1                   | 000000c0006ebe98:  000055e2ca7c920b <syscall.RawSyscall+43>  000000c0006ec4b8 
metadata-search-service_1                   | 000000c0006ebea8:  000055e2ca7c920b <syscall.RawSyscall+43>  0000000000000206 
metadata-search-service_1                   | 000000c0006ebeb8:  002b000000000033  0000000000000000 
metadata-search-service_1                   | 000000c0006ebec8:  0000000000000000  0000000000000000 
metadata-search-service_1                   | 000000c0006ebed8:  0000000000000000  000000c0006ebfc0 
metadata-search-service_1                   | 000000c0006ebee8:  000055e2ca8dca3d <net.(*netFD).connect.func2+237>  0000000000000003 
metadata-search-service_1                   | 000000c0006ebef8:  000055e2cba21300  000000000000001d 
metadata-search-service_1                   | 000000c0006ebf08:  000000c00023b080  000000c000328870 
metadata-search-service_1                   | 000000c0006ebf18:  000055e2cb28926c  000000c000600f40 
metadata-search-service_1                   | runtime.sigpanic()
metadata-search-service_1                   | 	/usr/lib/go/src/runtime/signal_unix.go:397 +0x405 fp=0xc0006ebe28 sp=0xc0006ebdf8 pc=0x55e2ca7637d5
metadata-search-service_1                   | created by internal/singleflight.(*Group).DoChan
metadata-search-service_1                   | 	/usr/lib/go/src/internal/singleflight/singleflight.go:88 +0x29f
metadata-search-service_1                   | 
metadata-search-service_1                   | goroutine 1 [IO wait]:
metadata-search-service_1                   | internal/poll.runtime_pollWait(0x7f82e5e1fa28, 0x72, 0x0)
metadata-search-service_1                   | 	/usr/lib/go/src/runtime/netpoll.go:182 +0x58
metadata-search-service_1                   | internal/poll.(*pollDesc).wait(0xc0006b0098, 0x72, 0x0, 0x0, 0x55e2cb28cc4a)
metadata-search-service_1                   | 	/usr/lib/go/src/internal/poll/fd_poll_runtime.go:87 +0x9d
metadata-search-service_1                   | internal/poll.(*pollDesc).waitRead(...)
metadata-search-service_1                   | 	/usr/lib/go/src/internal/poll/fd_poll_runtime.go:92
metadata-search-service_1                   | internal/poll.(*FD).Accept(0xc0006b0080, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
metadata-search-service_1                   | 	/usr/lib/go/src/internal/poll/fd_unix.go:384 +0x1bc
metadata-search-service_1                   | net.(*netFD).accept(0xc0006b0080, 0x55e2cb966140, 0x50, 0xc0006bc0f0)
metadata-search-service_1                   | 	/usr/lib/go/src/net/fd_unix.go:238 +0x44
metadata-search-service_1                   | net.(*TCPListener).accept(0xc0006be010, 0xc000498000, 0x7f82e8151498, 0xc000000180)
metadata-search-service_1                   | 	/usr/lib/go/src/net/tcpsock_posix.go:139 +0x34
metadata-search-service_1                   | net.(*TCPListener).AcceptTCP(0xc0006be010, 0x55e2ca72cb4a, 0x30, 0x55e2cb966140)
metadata-search-service_1                   | 	/usr/lib/go/src/net/tcpsock.go:247 +0x4a
metadata-search-service_1                   | net/http.tcpKeepAliveListener.Accept(0xc0006be010, 0x55e2cb966140, 0xc00068a1e0, 0x55e2cb8c39c0, 0x55e2cc184230)
metadata-search-service_1                   | 	/usr/lib/go/src/net/http/server.go:3264 +0x31
metadata-search-service_1                   | net/http.(*Server).Serve(0xc0001c5e10, 0x55e2cba205c0, 0xc0006be010, 0x0, 0x0)
metadata-search-service_1                   | 	/usr/lib/go/src/net/http/server.go:2859 +0x22f
metadata-search-service_1                   | net/http.(*Server).ListenAndServe(0xc0001c5e10, 0x1, 0x0)
metadata-search-service_1                   | 	/usr/lib/go/src/net/http/server.go:2797 +0xe6
metadata-search-service_1                   | main.main()
metadata-search-service_1                   | 	/go/src/metadata-search-service/metasearchd/metasearchd/main.go:183 +0x18a0
006ebda8:  000055e2ca74de44 <runtime.throw+116>  000000c0006ebdc8 
metadata-search-service_1                   | 000000c0006ebdb8:  000000c0006ebde8  000055e2ca74de44 <runtime.throw+116> 
metadata-search-service_1                   | 000000c0006ebdc8:  000000c0006ebdd0  000055e2ca777fc0 <runtime.throw.func1+0> 
metadata-search-service_1                   | 000000c0006ebdd8:  000055e2cb28a7ec  0000000000000005 
metadata-search-service_1                   | 000000c0006ebde8:  000000c0006ebe18  000055e2ca7637d5 <runtime.sigpanic+1029> 
metadata-search-service_1                   | 000000c0006ebdf8: <000055e2cb28a7ec  0000000000000005 
metadata-search-service_1                   | 000000c0006ebe08:  0000000000000000  0000000000008000 
metadata-search-service_1                   | 000000c0006ebe18:  000000c000000000 !0000000000008000 
metadata-search-service_1                   | 000000c0006ebe28: >0000000000000000  0000000000000000 
metadata-search-service_1                   | 000000c0006ebe38:  0000000000000000  0000000000000206 
metadata-search-service_1                   | 000000c0006ebe48:  ffffffffffffffff  0000000000000027 
metadata-search-service_1                   | 000000c0006ebe58:  000000c00068b320  0000000000000001 
metadata-search-service_1                   | 000000c0006ebe68:  0000000000000001  0000000000000000 
metadata-search-service_1                   | 000000c0006ebe78:  0000000000000000  0000000000000000 
metadata-search-service_1                   | 000000c0006ebe88:  0000000000000000  000000000000003e 
metadata-search-service_1                   | 000000c0006ebe98:  000055e2ca7c920b <syscall.RawSyscall+43>  000000c0006ec4b8 
metadata-search-service_1                   | 000000c0006ebea8:  000055e2ca7c920b <syscall.RawSyscall+43>  0000000000000206 
metadata-search-service_1                   | 000000c0006ebeb8:  002b000000000033  0000000000000000 
metadata-search-service_1                   | 000000c0006ebec8:  0000000000000000  0000000000000000 
metadata-search-service_1                   | 000000c0006ebed8:  0000000000000000  000000c0006ebfc0 
metadata-search-service_1                   | 000000c0006ebee8:  000055e2ca8dca3d <net.(*netFD).connect.func2+237>  0000000000000003 
metadata-search-service_1                   | 000000c0006ebef8:  000055e2cba21300  000000000000001d 
metadata-search-service_1                   | 000000c0006ebf08:  000000c00023b080  000000c000328870 
metadata-search-service_1                   | 000000c0006ebf18:  000055e2cb28926c  000000c000600f40 
metadata-search-service_1                   | runtime.sigpanic()
metadata-search-service_1                   | 	/usr/lib/go/src/runtime/signal_unix.go:397 +0x405 fp=0xc0006ebe28 sp=0xc0006ebdf8 pc=0x55e2ca7637d5
metadata-search-service_1                   | created by internal/singleflight.(*Group).DoChan
metadata-search-service_1                   | 	/usr/lib/go/src/internal/singleflight/singleflight.go:88 +0x29f
metadata-search-service_1                   | 
metadata-search-service_1                   | goroutine 1 [IO wait]:
metadata-search-service_1                   | internal/poll.runtime_pollWait(0x7f82e5e1fa28, 0x72, 0x0)
metadata-search-service_1                   | 	/usr/lib/go/src/runtime/netpoll.go:182 +0x58
metadata-search-service_1                   | internal/poll.(*pollDesc).wait(0xc0006b0098, 0x72, 0x0, 0x0, 0x55e2cb28cc4a)
metadata-search-service_1                   | 	/usr/lib/go/src/internal/poll/fd_poll_runtime.go:87 +0x9d
metadata-search-service_1                   | internal/poll.(*pollDesc).waitRead(...)
metadata-search-service_1                   | 	/usr/lib/go/src/internal/poll/fd_poll_runtime.go:92
metadata-search-service_1                   | internal/poll.(*FD).Accept(0xc0006b0080, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
metadata-search-service_1                   | 	/usr/lib/go/src/internal/poll/fd_unix.go:384 +0x1bc
metadata-search-service_1                   | net.(*netFD).accept(0xc0006b0080, 0x55e2cb966140, 0x50, 0xc0006bc0f0)
metadata-search-service_1                   | 	/usr/lib/go/src/net/fd_unix.go:238 +0x44
metadata-search-service_1                   | net.(*TCPListener).accept(0xc0006be010, 0xc000498000, 0x7f82e8151498, 0xc000000180)
metadata-search-service_1                   | 	/usr/lib/go/src/net/tcpsock_posix.go:139 +0x34
metadata-search-service_1                   | net.(*TCPListener).AcceptTCP(0xc0006be010, 0x55e2ca72cb4a, 0x30, 0x55e2cb966140)
metadata-search-service_1                   | 	/usr/lib/go/src/net/tcpsock.go:247 +0x4a
metadata-search-service_1                   | net/http.tcpKeepAliveListener.Accept(0xc0006be010, 0x55e2cb966140, 0xc00068a1e0, 0x55e2cb8c39c0, 0x55e2cc184230)
metadata-search-service_1                   | 	/usr/lib/go/src/net/http/server.go:3264 +0x31
metadata-search-service_1                   | net/http.(*Server).Serve(0xc0001c5e10, 0x55e2cba205c0, 0xc0006be010, 0x0, 0x0)
metadata-search-service_1                   | 	/usr/lib/go/src/net/http/server.go:2859 +0x22f
metadata-search-service_1                   | net/http.(*Server).ListenAndServe(0xc0001c5e10, 0x1, 0x0)
metadata-search-service_1                   | 	/usr/lib/go/src/net/http/server.go:2797 +0xe6
metadata-search-service_1                   | main.main()
metadata-search-service_1                   | 	/go/src/metadata-search-service/metasearchd/metasearchd/main.go:183 +0x18a0

I suspect that the issue comes from the typeslinkinit() in type.go, where it's trying to be smart about using previous types that it's seen, but I'm not 100% sure. I just know that unfortunately for now, we'll have to stay at go1.11.5 unless there's a resolution or workaround for this issue. Happy to provide more information in order to help you help us.

@randall77
Copy link
Contributor

I am unable to reproduce this with a simplified example
I'm not terribly surprised by a bug here, as plugin loading from multiple goroutines at once is nonstandard (and, I think, not tested).
Looking at the code, there's some synchronization in parts of the plugin loading path (e.g. for itab adding), but not for other parts. It would take some more work to investigate where additional synchronization is needed. It might be easiest to just wrap the whole thing in a lock.

A few things that would help, if you have the time:

  1. Make a small reproducer for us. I tried a simple example of loading multiple packages in parallel, but that didn't seem to trigger the issue by itself.
  2. Try modifying your code to load the packages in series instead of in parallel, and see if that fixes the issue.
  3. Try modifying the runtime to put a lock around the body of plugin_lastmoduleinit and/or chunks of that body, to see if that fixes the issue.

@bcmills bcmills added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Jul 8, 2019
@gopherbot
Copy link

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@golang golang locked and limited conversation to collaborators Aug 7, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

4 participants