Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crypto/x509: pkix.Name.String() does not preserve correct order #52462

Open
bjanders opened this issue Apr 21, 2022 · 4 comments
Open

crypto/x509: pkix.Name.String() does not preserve correct order #52462

bjanders opened this issue Apr 21, 2022 · 4 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@bjanders
Copy link

bjanders commented Apr 21, 2022

What version of Go are you using (go version)?

$ go version
go version go1.18.1 linux/amd64

Does this issue reproduce with the latest release?

Yes, produces with go version go1.18.1

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOVERSION="go1.18.1"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build281192732=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Create three ceritificates with openssl:

openssl req -new -x509 -subj '/CN=foo/O=bar/C=FI/' -nodes -outform DER -out cert1.crt
openssl req -new -x509 -subj '/O=bar/CN=foo/C=FI/' -nodes -outform DER -out cert2.crt
openssl req -new -x509 -multivalue-rdn -subj '/CN=foo+UID=1234/O=bar/C=FI/' -nodes -outform DER -out cert3.crt

Read in and print the subject name of each certificates with the following program:

package main

import (
	"crypto/x509"
	"fmt"
	"os"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Println("need path to cert")
		os.Exit(1)
	}
	f, err := os.Open(os.Args[1])
	if err != nil {
		panic(err)
	}
	certBytes := make([]byte, 2000)
	n, err := f.Read(certBytes)
	if err != nil {
		panic(err)
	}
	f.Close()
	cert, err := x509.ParseCertificate(certBytes[:n])
	if err != nil {
		panic(err)
	}

	fmt.Println(cert.Subject.String())
}

What did you expect to see?

cert1.crt: CN=foo,O=bar,C=FI

cert2.crt: O=bar,CN=foo,C=FI

cert3.crt: CN=foo+0.9.2342.19200300.100.1.1=#130431323334,O=bar,C=FI

(Or actually for cert3.crt: CN=foo+0.9.2342.19200300.100.1.1=1234,O=bar,C=FI, but that is another issue, see #33093.)

What did you see instead?

cert1.crt: CN=foo,O=bar,C=FI (as expected)

cert2.crt: CN=foo,O=bar,C=FI (order is not maintained)

cert3.crt : CN=foo,O=bar,C=FI,0.9.2342.19200300.100.1.1=#130431323334 (order is not mainainted, multi value RDN lost)

To be fair, the documentation for pkix.Name.String() says:

String returns the string form of n, roughly following the RFC 2253 Distinguished Names syntax.

But I think the "roughly" is a little too liberally interpretted.

RFC2253 2.1 says:

the output consists of the string encodings of each
RelativeDistinguishedName in the RDNSequence (according to 2.2),
starting with the last element of the sequence and moving backwards
toward the first.

So the ordering should be mainainted.

Looking at #39924 I suspect there is a misinterpretation where @FiloSottile says:

RFC 2253 lets us pick any order we like.

However, the quoted text is related to the individual RDN parts, so both the following would be correct for cert3.crt:

CN=foo+0.9.2342.19200300.100.1.1=#130431323334,O=bar,C=FI
0.9.2342.19200300.100.1.1=#130431323334+CN=foo,O=bar,C=FI

But not:

CN=foo,O=bar,C=FI,0.9.2342.19200300.100.1.1=#130431323334

Further and for clarity, looking at the ASN.1 structure of the Subject DN of cert3.crt:

SEQUENCE (3 elem)
      SET (2 elem)
        SEQUENCE (2 elem)
          OBJECT IDENTIFIER 2.5.4.3 commonName
          UTF8String foo
        SEQUENCE (2 elem)
          OBJECT IDENTIFIER 0.9.2342.19200300.100.1.1 userID
          UTF8String 1234
      SET (1 elem)
        SEQUENCE (2 elem)
          OBJECT IDENTIFIER 2.5.4.10 organizationName
          UTF8String bar
      SET (1 elem)
        SEQUENCE (2 elem)
          OBJECT IDENTIFIER 2.5.4.6 countryName
          PrintableString FI

You can see that each RDN is in a sequence (order should be maintained) but the multivalue RDN is a set (order does not matter. you can pick any order). The same goes for cert2.crt where the String() method throws the RDNs around in a fixed orderd and places unrecognized RDNs last.

In addition to this, it looks the String() method prints the sequence in reverse to what RFC 2253 says "starting with the last element of the sequence and moving backwards toward the first", which would suggest cert1.crt should actually print as (contrary to what I said in "What did you expect to see?"):

C=FI,O=bar,CN=foo

countryName is the last element and commonName is the first. RFC 2253 shows examples where the countryName is last, but it does not show the corresponding ASN.1 structure, so that cannot be used as a reference. The ASN.1 sequence in RFC 2253 is probably in reverse to what this example shows.

@thanm thanm added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Apr 22, 2022
@thanm
Copy link
Contributor

thanm commented Apr 22, 2022

@rolandshoemaker per owners

@rolandshoemaker
Copy link
Member

See #40876 for a previous discussion of these issues.

@bjanders
Copy link
Author

I think you need to update the documentation to better describe the behavior of the library. Saying "roughly following the RFC 2253 Distinguished Names syntax" is far too vague and, frankly, quite a stretch. As you see, there are quite many tickets related to this (I checked all open ones before filing this one, but did not see #40876 since it is closed), and this is most likely due to the misleading documentation. The documentation should contain a warning that if you want RFC 2253 (or RFC 4515, which has replaced RFC2253) syntax, then you are better off writing your own library. The way it goes now is that the Go user trusts that Go library works according to the standards (despite the "roughly"). Then you start noticing quite many things are off. Then you start debugging, and eventually spend time submitting a bug report. This then also unnecessarily wastes the Go maintainers time. If the documentation was more accurate a lot of time would be saved. I repeat myself, It really should contain a warning.

Regarding not wanting to fix this in the fear of breaking implementations relying on the wrong output, I see that the output has been changed before in #39924.

Please also respond in #33093 that you are not going to fix the bug (instead of insisting on that current behavior is correct) at the same time when the documentation is updated.

Is there a feature request to make a new RFC 2253/RFC 4514 compatible Go library in the future?

@bjanders
Copy link
Author

bjanders commented Apr 23, 2022

In your defence, though, the pkix.Name documentation says:

Name represents an X.509 distinguished name. This only includes the common
elements of a DN. Note that Name is only an approximation of the X.509
structure. If an accurate representation is needed, asn1.Unmarshal the raw
subject or issuer as an RDNSequence.

which is good, but that also could contain further detail on what that approximation is. It's when you read the Name.String() documentation that you get distracted by the "roughly RFC 2253 syntax". It should say something along the lines of:

borrows concepts from RFC 2253, but does not output a proper RFC 2253 string. If you need an accurate distinguished name representation, then asn1.Unmarshal the raw subject or issuer in a custom String() function. This library implementation prints the RDNs in a fixed order regardless of the order they are in the certificate, and only the following attributes are recognized and processed in order: [list attributes]. The remaining unrecognized attributes are placed last regardless of their position in the DN. Multivalued RDNs are not handled properly and instead printed as separate single valued RDNs. If the attribute is unknown (see list above) then it is printed as an OID and always hex-encoded regardless of the underlying type. For example UTF8 strings are hex encoded even though it could be represented as a string.

This method should not be used if you want to interoperate with other software that expects RFC 2253 compliant strings, or if you need a string that represent the structure of the X.509 issuer or subject name.

I did not yet fully verify that the above description is accurate, but it should be something along those lines.

@seankhliao seankhliao added this to the Unplanned milestone Aug 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants