Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/vuln: make progress-like output opt-in for -json flag #60497

Closed
VirrageS opened this issue May 30, 2023 · 9 comments
Closed

x/vuln: make progress-like output opt-in for -json flag #60497

VirrageS opened this issue May 30, 2023 · 9 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. vulncheck or vulndb Issues for the x/vuln or x/vulndb repo WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@VirrageS
Copy link

What version of Go are you using (go version)?

$ go version
go1.20.4

Does this issue reproduce at the latest version of golang.org/x/vuln?

Yes

What did you do?

govulncheck -json

What did you expect to see?

Currently, the nature of -json flag is that it emits progress-like output. This is extremely hard to parse by other tools, because many of them don't support multiple JSON objects defined one after another. I think the progress-like output should be opt-in instead of default value (at least for -json flag). I feel like this would improve handling certain scenario.

To give concrete example is that I would envision using job where I run govulncheck -json > vulns.json and then pass this vulns.json artifact file to next job that would do something with it. But then this other job would need to parse and omit all the objects which don't match certain structure. For example it is obsolete to parse:

{
  "progress": {
    "message": "Scanning your code and 625 packages across 64 dependent modules for known vulnerabilities..."
  }
}

because this is not related to the vulnerabilities itself.

What did you see instead?

I would like to see just one structure that could be parsed and is emitted after all the checks/passes are done. For example instead of:

{
  "config": {
    "version": "govulncheck@v0.0.0",
    "data_source": "https://vuln.go.dev",
    "last_modified": "2023-05-24T18:13:11Z",
    "go_version": "go1.20.4"
  }
}
{
  "progress": {
    "message": "Scanning your code and 625 packages across 64 dependent modules for known vulnerabilities..."
  }
}
{
  "vulnerability": {
    ...
  }
}

do:

{
  "config": {
    "version": "govulncheck@v0.0.0",
    "data_source": "https://vuln.go.dev",
    "last_modified": "2023-05-24T18:13:11Z",
    "go_version": "go1.20.4"
  },
  "progress": {
    "message": "Scanning your code and 625 packages across 64 dependent modules for known vulnerabilities..."
  },
  "vulnerabilities": [
    {
      ...
    }
  ]
}
@VirrageS VirrageS added the vulncheck or vulndb Issues for the x/vuln or x/vulndb repo label May 30, 2023
@gopherbot gopherbot modified the milestones: Unreleased, vuln/unplanned May 30, 2023
@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 30, 2023
@mknyszek
Copy link
Contributor

CC @golang/vulndb

@ianthehat
Copy link

Are there any particular tools you have in mind? All the existing tools that we were thinking about handle json streams, and if you are writing a new tool we provide some very simple code to handle it.
You can always convert it to an array using something like jq -s '.' if you have an unusual use case that really needs a non stream version

@seankhliao seankhliao added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label May 30, 2023
@VirrageS
Copy link
Author

VirrageS commented May 31, 2023

Maybe "tools" was not the correct word. You are right jq could help with processing the output. But the problem happens when you try to for example read this file with:

data, _ := os.ReadFile(...)
var output any
err := json.Unmarshal(data, &output)
fmt.Println(err)

This would produce error invalid character '{' after top-level value.

One could use json.Decoder (as suggested for example in https://stackoverflow.com/a/50352805) but the thing is that you would need to handle 3 different structs (at least for now).

That's why I'm saying having some kind of "wrapper" struct over all of this would solve the problem because I could read everything and then focus on the field that I'm most interested. Right now, there are options to deal with it, like jq or json.Decoder and probably some more, but I would love to see that working out of a box.

@VirrageS
Copy link
Author

Another thing, is that even if you are relaying on jq -s '.' then you get array of objects. Then you need to filter out things that match or not match the object that you are interested in particular (worse if you rely on index in this array). And what happens if govulncheck decides to change the order or add something in between? Then suddenly your script could start failing. If we would have well defined structure adding new field would not affect any existing tools/scripts.

@VirrageS
Copy link
Author

To list all findings you need to do govulncheck -json ./... | jq -s '.[].finding | select(. != null)' which in my opinion is not easy. And you need to have jq in the first place.

@ianthehat
Copy link

Easy is not really the goal of the format, powerful and flexible for writing tools on top of it is.
The normal way to process the stream is using code like internal/govulncheck/handler.go
That is the code used by govulncheck itself when converting from json to text, and that package is intended to be code you can copy paste into your own tools. It is not a lot of code.
govulncheck makes no promises at all about the ordering of messages in the output, or how many it will emit (even when scanning the same code with the same database, because it is using heuristics to pick representative call paths), so you can never safely index into the array.

@VirrageS
Copy link
Author

Easy is not really the goal of the format, powerful and flexible for writing tools on top of it is.

As I agree with this statement, I don't fully understand how this applies to my proposal. Could you give me some use-case when the streaming format is more "powerful and flexible" comparing to my static definition? Also not sure if this is good comparison in the first place since they have their different purposes.

Also, as I mentioned, I don't really want to drop the streaming, I just want to add an option to disable it.

That is the code used by govulncheck itself when converting from json to text, and that package is intended to be code you can copy paste into your own tools. It is not a lot of code.

But could be less :D Also, you already have Message defined which is almost exactly what we need https://go.googlesource.com/vuln/+/master/internal/govulncheck/govulncheck.go#21 we just need to make some of the fields arrays instead of just single values.

@ianthehat
Copy link

Streaming is useful for cases where the scanning is going to take a long time.
For instance, you can emit the progress messages that tell you how much work it thinks it is going to have to do, so the user has some idea how long it is going to take, or you can abort early on the first finding if you are just trying to determine vulnerable or not.
Streams can also be merged, if you concatenate two streams, the resulting stream is still valid, the same would not be true for a static form (you would have to write the merge tool).
It is easy to see how to build a static form from the streaming one, it is impossible to go the other way round, so we either only have streaming, or we have both streaming and static (which would have to be more code).
Understanding the streaming form is so easy that it seems pointless to add a static form on top of it. If you really want it, you can easily define your own static form and write a small tool that takes the streaming form and makes your static form from it.
The vast majority of uses will have to group or filter the data in some form (by vulnerability, by package etc), which they will do by iterating the findings anyway, so there is no benefit to the static form when it comes to processing the data.

@VirrageS
Copy link
Author

Understanding the streaming form is so easy that it seems pointless to add a static form on top of it. If you really want it, you can easily define your own static form and write a small tool that takes the streaming form and makes your static form from it.

I see that you have pretty strong opinion about that, so I guess then this enhancement proposal is not going to happen :(

@seankhliao seankhliao closed this as not planned Won't fix, can't repro, duplicate, stale Aug 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. vulncheck or vulndb Issues for the x/vuln or x/vulndb repo WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

5 participants