proposal: encoding/baseXX: add Encoding.RejectLineFeeds #53845

Open
dsnet opened this issue Jul 13, 2022 · 3 comments


dsnet commented Jul 13, 2022

Currently, base32 and base64 ignore carriage returns and line feeds by default.

This behavior goes against RFC 4648, section 3.3, which states:

Implementations MUST reject the encoded data if it
contains characters outside the base alphabet when interpreting base-encoded data,
unless the specification referring to this document explicitly states otherwise.
Such specifications may instead state, as MIME does, that
characters outside the base encoding alphabet should
simply be ignored when interpreting data ("be liberal in what you accept").
Note that this means that any adjacent carriage return/line feed (CRLF) characters
constitute "non-alphabet characters" and are ignored.

Rejection of "characters outside the base encoding alphabet" (including carriage returns and line feeds) should be the default,
unless specified otherwise by some higher-level specification (e.g., MIME).
The decision to allow \r or \n should not have been made by the base32 and base64 packages,
but rather by their users.
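To make the current behavior concrete, here is a minimal sketch using only the existing encoding/base64 API. The helper name decodeLenient is made up for illustration; the inputs are the base64 form of "hello" with and without embedded non-alphabet characters.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeLenient (hypothetical helper name) decodes s with the standard
// encoding, which today silently skips '\r' and '\n' in the input.
func decodeLenient(s string) (string, error) {
	b, err := base64.StdEncoding.DecodeString(s)
	return string(b), err
}

func main() {
	inputs := []string{
		"aGVsbG8=",     // clean input
		"aGVs\r\nbG8=", // CRLF in the middle: accepted today
		"aGVs bG8=",    // space in the middle: rejected
	}
	for _, in := range inputs {
		out, err := decodeLenient(in)
		fmt.Printf("%q -> %q, err=%v\n", in, out, err)
	}
}
```

The first two inputs decode to the same bytes, while the third fails, so CR and LF are the only non-alphabet characters the decoder currently tolerates.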

Today, base32 and base64 already ignore \r and \n by default and we can't change that,
but we should expose control over this behavior:

// RejectLineFeeds creates a new encoding identical to enc except that
// it rejects the presence of carriage returns and line feeds as
// described in RFC 4648, sections 3.1 and 3.3.
func (enc Encoding) RejectLineFeeds() *Encoding

dsnet commented Jul 26, 2022

An alternative and more flexible API is (per #54054 (comment)):

// WithIgnored specifies a set of non-alphabet characters that are ignored
// when parsing the input. An empty string causes the decoder to reject
// all characters that are not part of the encoding alphabet.
// A newly created Encoding ignores '\r' and '\n' by default.
func (enc Encoding) WithIgnored(chars string) *Encoding

My original proposal would be equivalent to enc.WithIgnored(""),
while #54054 could be accomplished using enc.WithIgnored("\t\v\f \r\n").

@gopherbot

Change https://go.dev/cl/532295 mentions this issue: encoding: support WithIgnored in base32 and base64


dsnet commented Oct 2, 2023

This feature, combined with #53844, makes it possible to implement a bijective mapping between baseXX text and binary data. This would allow base32 and base64 to produce a canonical encoding per RFC 4648, section 3.5.
