crypto/x509: "certificate is not standards compliant" on MacOS #51991

Open
dims opened this issue Mar 28, 2022 · 54 comments
Labels: NeedsInvestigation, OS-Darwin

@dims

dims commented Mar 28, 2022

We hit an error with a unit test we had in Kubernetes and started looking at the impact on end users of kubernetes if the problem is not resolved by the time kubernetes 1.24 is released. More context: please see Kubernetes issue - kubernetes/kubernetes#108956

What version of Go are you using (go version)?

$ go version
go version go1.18 darwin/arm64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/dims/Library/Caches/go-build"
GOENV="/Users/dims/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/dims/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/dims/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/homebrew/Cellar/go/1.18/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/homebrew/Cellar/go/1.18/libexec/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.18"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/qw/pkzvlrfs7rn7h6r1x7r57_rw0000gn/T/go-build1513460199=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Please see https://go.dev/play/p/w4rr43vQv7d

What did you expect to see?

success

What did you see instead?

error: x509: “no-sct.badssl.com” certificate is not standards compliant
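
The playground snippet itself is not reproduced in this thread. A minimal sketch of a request that would exercise the same path, assuming the original simply performed an HTTPS GET against the host named in the error message with the default TLS configuration (the `fetchStatus` helper is hypothetical):

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// fetchStatus performs a GET with the default TLS configuration. With no
// RootCAs set, verification on macOS goes through the platform verifier.
func fetchStatus(url string) (string, error) {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return resp.Status, nil
}

func main() {
	// Host taken from the error message reported in this issue.
	status, err := fetchStatus("https://no-sct.badssl.com/")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("status:", status)
}
```

The outcome depends on the Go version, the OS, and the certificate the host currently serves, so no particular output is assumed here.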

@dims
Author

dims commented Mar 28, 2022

The error seems to be introduced here: feb024f#diff-9e2a37df9605e8b207365b51999e6b14e1f5db72b27ad33514dbac502d477c25R212

@liggitt summarized the ask here: kubernetes/kubernetes#108956 (comment) and kubernetes/kubernetes#108956 (comment). The question is what impact this will have on Kubernetes users who had no issues with go1.17 builds when they try the kubectl command built using go1.18.

Worst case, we would like to document the scenarios under which users will hit the "certificate is not standards compliant" error that they were not hitting before.

thanks!

@seankhliao seankhliao added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 28, 2022
@seankhliao
Member

cc @golang/security

@FiloSottile
Contributor

Can you share the actual affected certificate? You are unlikely to be hitting the same root cause as no-sct.badssl.com in a unit test, because SCT rules are only enforced for WebPKI certificates.

@liggitt
Contributor

liggitt commented Mar 28, 2022

We only noticed the issue in a unit test checking a cert we expected to be considered invalid (it was a negative test of a cert not signed by a trusted root), so at first we thought we just needed to make the unit test more tolerant of various error messages.

In spelunking around the change in the message, we also ran across https://groups.google.com/g/golang-nuts/c/RGghq2gTWss/m/7GsudTfCAgAJ which indicated requests that previously succeeded were now failing.

Before papering over the change in our unit test by tolerating more validation error messages, I wanted to understand more about which certificates the go validator considers valid that the macOS validator does not. (kubernetes/kubernetes#108956 (comment))

@rolandshoemaker
Member

Apple enforces their SCT requirements on all publicly trusted certificates as part of its base TLS policy (which we use via SecPolicyCreateSSL, since we are generally targeting the web PKI.) Publicly trusted certificates that lack embedded SCTs are very rare, making up something like 0.01% of all publicly trusted certs, but they are out there (the AWS example being probably the most common.)

This will only affect users who are using the bare system certificate pool and are validating certificates that lack embedded SCTs entirely, with the server instead providing them via a TLS extension.
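
To check whether a given certificate falls into this category, one option is to look for the SCT list extension from RFC 6962 (OID 1.3.6.1.4.1.11129.2.4.2). The following is a sketch only: the helper names are hypothetical, and the presence of the extension says nothing about whether Apple's policy accepts the SCTs inside it.

```go
package main

import (
	"crypto/x509"
	"encoding/asn1"
	"encoding/pem"
	"errors"
	"fmt"
	"os"
)

// sctListOID identifies the X.509 extension that carries embedded SCTs (RFC 6962).
var sctListOID = asn1.ObjectIdentifier{1, 3, 6, 1, 4, 1, 11129, 2, 4, 2}

// hasEmbeddedSCT reports whether the certificate carries the SCT list extension.
func hasEmbeddedSCT(cert *x509.Certificate) bool {
	for _, ext := range cert.Extensions {
		if ext.Id.Equal(sctListOID) {
			return true
		}
	}
	return false
}

// embeddedSCTInPEM parses the first PEM certificate in pemData and checks it.
func embeddedSCTInPEM(pemData []byte) (bool, error) {
	block, _ := pem.Decode(pemData)
	if block == nil || block.Type != "CERTIFICATE" {
		return false, errors.New("no PEM certificate found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return false, err
	}
	return hasEmbeddedSCT(cert), nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: sctcheck <cert.pem>")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	ok, err := embeddedSCTInPEM(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("embedded SCT list present:", ok)
}
```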

It is noted in a TODO in crypto/x509/root_darwin.go that we may want to support passing in SCTs delivered this way, since we have no way of telling Apple to disable this particular policy (I would need to double check, but it's possible these requirements are not enforced if you use SecPolicyCreateBasicX509, but that would likely also disable all of the other web PKI policies that we want applied). How to do that is rather nuanced, though, since we'd only do anything with them on macOS.

@rolandshoemaker
Member

Slight side note: the AWS case is a weird one, because I don't think they are sending SCTs at all, despite using publicly trusted certs, so even implementations that know how to pipe SCTs passed via TLS extensions wouldn't work on macOS 🤷.

@FiloSottile
Contributor

I don't think the SCT policy explains the new error in kubernetes/kubernetes#108956 though, because that's not a publicly trusted certificate. If you want to extract that certificate and share it with us, we can tell you why that one is failing, too.

Before papering over the change in our unit test by tolerating more validation error messages, I wanted to understand more about which certificates the go validator considers valid that the macOS validator does not. (kubernetes/kubernetes#108956 (comment))

We can't really answer this exhaustively, because the macOS verifier has a number of evolving policies that change between OS versions. Note that the platform verifier is only used when the system roots are involved, so behaving like the system is what's expected. I assume k8s clusters use private CAs configured through config.RootCAs for most purposes, which would be unaffected by this.

@FiloSottile
Contributor

Slight side note: the AWS case is a weird one, because I don't think they are sending SCTs at all, despite using publicly trusted certs, so even implementations that know how to pipe SCTs passed via TLS extensions wouldn't work on macOS 🤷.

Customers should probably reach out to AWS about this. As a short-term workaround, it should be possible to add the Amazon root CAs to an x509.SystemCertPool() and use it as config.RootCAs so that the Go verifier is used as well as the system one. (Don't start with an empty pool, so that if the root changes you have a chance at not breaking.)
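
A minimal sketch of that workaround, assuming the extra roots are available as a PEM file on disk. The helper names `addRoots` and `clientWithExtraRoots` are hypothetical; the `crypto/x509` and `crypto/tls` calls are the standard ones.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"net/http"
	"os"
)

// addRoots appends every PEM-encoded certificate in pemData to base.
func addRoots(base *x509.CertPool, pemData []byte) error {
	if !base.AppendCertsFromPEM(pemData) {
		return errors.New("no certificates found in PEM data")
	}
	return nil
}

// clientWithExtraRoots starts from the system pool (not an empty one) and
// adds the given roots, then uses the result as RootCAs.
func clientWithExtraRoots(pemData []byte) (*http.Client, error) {
	pool, err := x509.SystemCertPool()
	if err != nil {
		return nil, err
	}
	if err := addRoots(pool, pemData); err != nil {
		return nil, err
	}
	transport := &http.Transport{TLSClientConfig: &tls.Config{RootCAs: pool}}
	return &http.Client{Transport: transport}, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: extraroots <roots.pem>")
		return
	}
	pemData, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if _, err := clientWithExtraRoots(pemData); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("client configured with system roots plus extra roots")
}
```

The same `tls.Config` can be handed to any other client that accepts one; the `http.Client` here is only an example.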

@liggitt
Contributor

liggitt commented Mar 29, 2022

It looks like this change means we no longer get typed TLS errors (e.g. x509.UnknownAuthorityError) when validating using system roots.

That means that special handling of those errors (logging or other fallback paths) that previously worked no longer works in go1.18.

edit: I'll open a separate issue for that, since that is distinct from the "certificate considered valid in go1.17" → "certificate considered invalid in go1.18" issue
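
For illustration, the kind of special handling that depends on typed errors might look like the sketch below. The `classify` helper and the two sample errors are hypothetical stand-ins; the untyped sample only mimics the message reported in this issue.

```go
package main

import (
	"crypto/x509"
	"errors"
	"fmt"
)

// Sample errors standing in for what a TLS handshake might return.
var (
	sampleTyped   = fmt.Errorf("tls: failed to verify certificate: %w", x509.UnknownAuthorityError{})
	sampleUntyped = errors.New("x509: certificate is not standards compliant")
)

// classify uses errors.As to pick out the typed x509 errors that callers
// commonly special-case; anything else falls through to "untyped".
func classify(err error) string {
	var unknownAuthority x509.UnknownAuthorityError
	var invalid x509.CertificateInvalidError
	switch {
	case err == nil:
		return "ok"
	case errors.As(err, &unknownAuthority):
		return "unknown authority"
	case errors.As(err, &invalid):
		return "invalid certificate"
	default:
		return "untyped"
	}
}

func main() {
	fmt.Println(classify(sampleTyped))
	fmt.Println(classify(sampleUntyped))
}
```

Code that only handles the typed branches never sees the untyped case, which is why fallback paths keyed on those types stop firing.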

@liggitt
Contributor

liggitt commented Mar 29, 2022

opened #52010 for the untyped error issue

@liggitt
Contributor

liggitt commented Mar 29, 2022

I assume k8s clusters use private CAs configured through config.RootCAs for most purposes, which would be unaffected by this.

I also expect that to be true in most scenarios (and, in scenarios where it isn't, for the certs issued by public CAs to be compatible with system roots, though the referenced AWS issue is evidence that not all certs issued by public CAs are valid).

For k8s' use, I don't think this issue is very significant.

@rolandshoemaker
Member

Oh, that particular test certificate (in TestTLSConfig) is non-compliant in a handful of ways. It's self-signed, but isCA is false, it is missing the cert sign key usage, and its validity period is likely too long (although I'm not sure if macOS enforces that for self-signed certs.)
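
The three properties listed above can be checked on a parsed certificate roughly as follows. This is a sketch under assumptions, not the macOS policy: the 825-day limit is the figure quoted in a commit message later in this thread, comparing raw subject and issuer is only a heuristic for "self-signed", and the `sample` certificate is a hand-built struct for demonstration.

```go
package main

import (
	"bytes"
	"crypto/x509"
	"fmt"
	"time"
)

// maxValidity is an assumed limit (825 days), not a value confirmed by Apple here.
const maxValidity = 825 * 24 * time.Hour

// sample mimics the shape of the problematic test certificate.
var sample = &x509.Certificate{
	RawSubject: []byte("CN=test"),
	RawIssuer:  []byte("CN=test"),
	IsCA:       false,
	KeyUsage:   x509.KeyUsageDigitalSignature,
	NotBefore:  time.Date(2022, 1, 1, 0, 0, 0, 0, time.UTC),
	NotAfter:   time.Date(2032, 1, 1, 0, 0, 0, 0, time.UTC),
}

// complianceIssues returns a description of each property that looks suspect.
func complianceIssues(cert *x509.Certificate) []string {
	var issues []string
	// Heuristic: identical subject and issuer means self-issued.
	if bytes.Equal(cert.RawSubject, cert.RawIssuer) {
		if !cert.IsCA {
			issues = append(issues, "self-signed but IsCA is false")
		}
		if cert.KeyUsage&x509.KeyUsageCertSign == 0 {
			issues = append(issues, "self-signed but missing the cert sign key usage")
		}
	}
	if cert.NotAfter.Sub(cert.NotBefore) > maxValidity {
		issues = append(issues, "validity period exceeds 825 days")
	}
	return issues
}

func main() {
	for _, issue := range complianceIssues(sample) {
		fmt.Println(issue)
	}
}
```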

@calvinbui

We've had the same issue with connecting to AWS Elasticache Redis servers. Amazon will not support SCTs to avoid publishing customer cluster names in a public log. The connection previously worked fine in 1.17.

@jimidle

jimidle commented Mar 31, 2022

A little more background about AWS, or at least how we were connecting to the Neptune graph database service. In case it helps anyone.

Because Neptune is a little "light" on security, you can only connect to it through local/private VPC. This isn't very useful for developers, so we have a VPN to a bastion host for a development only instance of Neptune (Neptune does not have any local installation - it is an AWS service only).

It seems that AWS did not feel the need to put any SCTs into the Neptune cert, thinking it would only see connections from the secured VPC, and so our connections will fail (via Go; it is fine from Java, for instance).

We have raised a ticket with AWS about this. There isn't much that can be done about that in Go.

As this is a developer only connection, we have created a reverse proxy with a local CA root. This allows the connection for developers. Hokey, but does what we want for a developer connection. The real solution is of course for AWS to re-issue their certificate, however they say they don't want SCTs in order to avoid placing customer cluster names in a public log (see #51991 (comment) )

@bcmills bcmills changed the title from crypto/x509: "certificate is not standards compliant" on MacOS only with golang 1.18 to crypto/x509: "certificate is not standards compliant" on MacOS on Apr 20, 2022
@bcmills bcmills added this to the Go1.19 milestone Apr 20, 2022
@rolandshoemaker
Member

As far as I can tell (this is unbearably painful to diagnose), this seems to be an issue with 10.15.1, which is what the darwin-amd64-10_15 builder is running. I suspect that updating the builder to use 10.15.6 would fix this, but I have absolutely no clue how viable that is.

@bcmills
Contributor

bcmills commented Apr 20, 2022

I suspect that updating the builder to use 10.15.6 would fix this, but I have absolutely no clue how viable that is.

@golang/release, can you weigh in on that? (How hard is the macOS 10.15 image to update?)

@heschi
Contributor

heschi commented Apr 20, 2022

For amd64 I think it's maybe a day's work, if we're willing to cut over all the builders at once. Rolling it out gradually will be more unpleasant. I haven't read this issue to judge whether it's a good use of time.

@rolandshoemaker
Member

I don't think there is really any other way to address this issue. Given how deeply integrated the TLS client is in the toolchain, there isn't really any (safe) way of silently handling or skipping these failures. It's not a high-frequency flake though (it seems somewhat correlated with when new certificates are issued), so it's probably not super high priority.

@FiloSottile
Contributor

(it seems somewhat correlated with when new certificates are issued)

Can the machine reach the internet? That sounds consistent with a bloom filter window miss on the Apple Valid system, which leads to an OCSP connection to the CA. If that fails, I could see it leading to a vague error like this.

@jbg

jbg commented May 11, 2022

In some cases certificates may be deliberately excluded from CT logs to avoid publishing a detailed map of internal infrastructure. (In our case, the certs are associated with DNS names that are only resolvable internally, and which resolve to private IPs.)

e.g. AWS ACM allows disabling CT logging for this purpose, which will result in a valid certificate issued by a trusted CA but not listed in CT logs.

When trying to access a service with such a cert from Go (in our case, using a Terraform provider) on developer (darwin_arm64) machines, we get this certificate is not standards compliant error.

Is there any solution other than logging the certs? Is there any knob in the Go TLS client for turning off the check, which the TF provider could provide a config option to turn?

@jimidle

jimidle commented May 11, 2022 via email

@jameskilroe

I am using go version go1.20.1 darwin/arm64 and macOS version 13.2.1 (22D68), and this issue now occurs when I use the Dial() function from github.com/gorilla/websocket.

The error is: tls: failed to verify certificate: x509: “*.exchange.coinbase.com” certificate is not standards compliant

When is this scheduled to be resolved? Is there any simple workaround if accessing a public site where one has no control over the certificate?

Any help greatly appreciated!

@AaronFriel

AaronFriel commented Feb 15, 2023

@jameskilroe hey! we at Pulumi dug deep into this after seeing a similar issue. You should be able to resolve it by restarting your machine. We found that should almost always fix the issue.

Workaround

We found that in every case where a user reported this issue, one of these was true:

  1. The machine had recently been reimaged and not rebooted since.
  2. The user had not restarted the machine in a substantial period of time.

In both cases, restarting the OS was the workaround.


Analysis

I'm going to summarize my colleague @kmosher's analysis. In recent years Apple, Google, and others have added the following requirements for a TLS cert to be considered valid:

On macOS, and in Safari and Go programs, the system service trustd is responsible for checking a certificate against certificate transparency (CT) log information cached by the OS.

For unknown reasons, trustd does not update the list of trusted certificate transparency (CT) logs it uses while the system is running. As a result, certificates signed against chains that are currently trusted (and may be listed in the file below) aren't considered valid until a restart.

The trusted certificate transparency logs on macOS can be located here:

/System/Library/Security/Certificates.bundle/Contents/Resources/TrustedCTLogs.plist

And Apple publishes a JSON document for the OS to update from here:

https://valid.apple.com/ct/log_list/current_log_list.json

@AaronFriel

@rsc, @ianlancetaylor: Based on the above, I believe this issue can be closed - or perhaps moved to a Discussion with an accepted answer. I believe that someone should file a report with Apple's bug reporting tool Feedback Assistant to resolve this behavior with macOS if it hasn't been. This issue may already be fixed and released, but as we describe above, users who encounter the issue are likely to be on out of date machines that have not updated or even restarted their machine recently.

@jameskilroe

Hi @AaronFriel

Thanks for the answer. I updated my machine to the latest OS to try to solve the problem and restarted it a few times this morning, but unfortunately the problem persists.

I did notice that the files in my TrustedCTLogs.plist were last updated on 9 Feb (before my update). Is there any chance you know how to force an update? I did some googling but found no obvious answers.

All round very frustrating!

@jbg

jbg commented Feb 16, 2023

@jameskilroe rather than looking at the date, diff it against the current list (https://valid.apple.com/ct/log_list/current_log_list.json). The set of valid CT logs doesn't change that often (last change 15 Jan). If you have the latest list and have rebooted recently, then you may have a different issue (like the certificate is not actually present in the CT logs — you can check this — or is valid for too long).
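
A rough sketch of such a comparison follows. It assumes both sides are already available as JSON documents with an `operators[].logs[].log_id` layout, which is an assumption about the published file and is not confirmed in this thread. How to extract the log IDs from TrustedCTLogs.plist is left out, since the plist layout is not described here. The `logIDs` and `missing` helpers are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"sort"
)

// logList models only the fields this sketch needs (assumed layout).
type logList struct {
	Operators []struct {
		Logs []struct {
			LogID string `json:"log_id"`
		} `json:"logs"`
	} `json:"operators"`
}

// logIDs collects every log_id found in the document.
func logIDs(data []byte) (map[string]bool, error) {
	var list logList
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, err
	}
	ids := make(map[string]bool)
	for _, op := range list.Operators {
		for _, l := range op.Logs {
			ids[l.LogID] = true
		}
	}
	return ids, nil
}

// missing returns the IDs present in current but absent from local, sorted.
func missing(local, current map[string]bool) []string {
	var out []string
	for id := range current {
		if !local[id] {
			out = append(out, id)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: ctdiff <local.json> <current.json>")
		return
	}
	var sets [2]map[string]bool
	for i, path := range os.Args[1:] {
		data, err := os.ReadFile(path)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		ids, err := logIDs(data)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		sets[i] = ids
	}
	for _, id := range missing(sets[0], sets[1]) {
		fmt.Println("missing locally:", id)
	}
}
```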

jsoriano added a commit to elastic/elastic-package that referenced this issue Feb 21, 2023
Clients that rely on OSX APIs for certificate validation may find an error
with the message "certificate is not standards compliant" with certificates
that don't comply with Apple rules for certificate validation. When this error
happens, the actual reason for each certificate is not exposed, and it
seems to happen with certificates that should be valid in the context of
the validation.
More discussion about this can be found in golang/go#51991.

This happens with certificates generated by `elastic-package`, clients
sometimes report this error with configurations that otherwise should accept
these certificates.

According to this post, one of the rules is that certificates cannot be valid for
more than 825 days.
https://rahulkj.github.io/openssl,/certificates/2022/09/09/self-signed-certificates.html

Reduce the expiration time to try to reduce the chances of triggering this error.
@gopherbot gopherbot modified the milestones: Go1.21, Go1.22 Aug 8, 2023
@andresvia

I'm sad to report that restarting my M1 also didn't work for me.

@sandheepp

tls: failed to verify certificate: SecPolicyCreateSSL error: 0
I am using AWS serverless offline and trying to hit an endpoint, which is also failing on a Mac M2. This makes local development very inconvenient after the latest update from Apple.

@jimidle

jimidle commented Sep 15, 2023 via email

@jimidle

jimidle commented Sep 15, 2023 via email

raghavendra-talur added a commit to raghavendra-talur/ramen that referenced this issue Sep 19, 2023
On Mac, the check for certs is more strict and it fails for submariner
service. Turning off the check for certs.

More info: golang/go#51991

Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
@gopherbot gopherbot modified the milestones: Go1.22, Go1.23 Feb 6, 2024
@SaberStrat

Rebooting didn't help on my M3 Pro either unfortunately. Was still getting that error with our company's self-signed cert.

Went with the workaround of using a linux container with go in it.

@Ademord

Ademord commented Apr 2, 2024

Can someone also explain why this problem is happening? I get it when I do "oc login".

@jboykin-bread

jboykin-bread commented Apr 16, 2024

We hit this trying to use mirrord with some of our services. For folks trying to work with AWS services that don't use SCTs in their certs locally on macOS, here's a temporary workaround that's working for us.

  1. Download one of AWS's intermediate or Root CAs that issued the problematic cert lacking an SCT. For us to work with ElastiCache, that ended up being this cert: https://www.amazontrust.com/repository/G2-RootCA1.pem
  2. Import the pem file into your macOS keychain, and set the trust policy to "Always Trust" (reference).

The call to SecTrustEvaluateWithError should now pass (https://github.com/golang/go/blob/master/src/crypto/x509/internal/macos/security.go#L200-L215).
