Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: encoding/csv: CSV Injection #58849

Closed
wroge opened this issue Mar 3, 2023 · 11 comments
Closed

proposal: encoding/csv: CSV Injection #58849

wroge opened this issue Mar 3, 2023 · 11 comments

Comments

@wroge
Copy link

wroge commented Mar 3, 2023

The CSV Writer does not provide protection against CSV injection. I think this should be implemented in the standard library so that you don't have to resort to external packages.

https://owasp.org/www-community/attacks/CSV_Injection

@wroge wroge added the Proposal label Mar 3, 2023
@gopherbot gopherbot added this to the Proposal milestone Mar 3, 2023
@seankhliao
Copy link
Member

What specific changes are you proposing to encoding/csv?

@wroge
Copy link
Author

wroge commented Mar 3, 2023

A cell should always be escaped when the following case occurs.

This attack is difficult to mitigate, and explicitly disallowed from quite a few bug bounty programs. 
To remediate it, ensure that no cells begin with any of the following characters:

Equals to (=)
Plus (+)
Minus (-)
At (@)
Tab (0x09)
Carriage return (0x0D)
Keep in mind that it is not sufficient to make sure that the untrusted user input does not start with 
these characters. You also need to take care of the field separator (e.g., ‘,’, or ‘;’) and quotes (e.g., ', or "), 
as attackers could use this to start a new cell and then have the dangerous character in the middle of the 
user input, but at the beginning of a cell.

Alternatively, one could provide an option to escape all cells.

Alternatively, apply the following sanitization to each field of the CSV, so that their content will be read as 
text by the spreadsheet editor:

Wrap each cell field in double quotes
Prepend each cell field with a single quote
Escape every double quote using an additional double quote

Or both.

@seankhliao
Copy link
Member

Option 1 does not look like a reasonable thing to implement for a generic package like encoding/csv. the package does wrap/escape quotes and separators, it would be sufficient for you to church for leading characters yourself.

Option 2 changes the underlying data to always begin with '

@wroge
Copy link
Author

wroge commented Mar 3, 2023

Cells containing quotes are escaped again.

"'=1+1" -> ""'=1+1""

Therefore, I cannot use this package as it is.

@wroge
Copy link
Author

wroge commented Mar 3, 2023

I think this way I can still use the package. Thanks for the effort!
An option to escape all cells would still be appreciated though!

import (
	"bufio"
	"encoding/csv"
	"fmt"
)

func NewCSVWriter(b *bufio.Writer, comma rune) CSVWriter {
	w := csv.NewWriter(b)
	w.Comma = comma

	return CSVWriter{
		Writer: w,
	}
}

type CSVWriter struct {
	Writer *csv.Writer
}

func (w CSVWriter) Error() error {
	return w.Writer.Error()
}

func (w CSVWriter) WriteAll(records [][]string) error {
	for _, record := range records {
		err := w.Write(record)
		if err != nil {
			return err
		}
	}

	w.Writer.Flush()

	return w.Writer.Error()
}

func (w CSVWriter) Flush() {
	w.Writer.Flush()
}

func (w CSVWriter) Write(record []string) error {
	for i, each := range record {
		if len(each) == 0 {
			continue
		}

		switch each[0] {
		case '=', '+', '-', '@', '\t', '\r':
			record[i] = fmt.Sprintf(`'%s`, each)
		}
	}

	return w.Writer.Write(record)
}

@seankhliao
Copy link
Member

Looking at what you've implemented this looks out of scope for encoding/csv, which implements RFC 4180.

@seankhliao seankhliao closed this as not planned Won't fix, can't repro, duplicate, stale Mar 3, 2023
@aidantwoods-1p
Copy link

aidantwoods-1p commented Apr 4, 2023

RFC 4180 does state (emphasis added):

Each field may or may not be enclosed in double quotes

It would be possible to implement OWASPs guidance if we added a new option to the CSV writer to quote all fields (overriding the behaviour of fieldNeedsQuotes.

Could we add a new option to the writer

type Writer struct {
	Comma   rune // Field delimiter (set to ',' by NewWriter)
	UseCRLF bool // True to use \r\n as the line terminator
	QuoteAllFields bool // True to quote all fields, even if not required by RFC 4180
	w       *bufio.Writer
}

Then func (w *Writer) Write(record []string) could be adjusted to do

  		// If we don't have to have a quoted field then just
  		// write out the field and continue to the next field.
- 		if !w.fieldNeedsQuotes(field) {
+ 		if !w.QuoteAllFields && !w.fieldNeedsQuotes(field) {

Either way this is RFC 4180 compliant, but it gives users the option to opt-in to quoting all fields to better encapsulate alternative delimiters that might be interpreted by certain spreadsheet programs.

@seankhliao
Copy link
Member

we declined that in #12755

@aidantwoods-1p
Copy link

The reason cited is complexity, but it would be a one line change unless I'm missing something?

@aidantwoods-1p
Copy link

The alternative for folks that want to output slightly safer CSVs is to roll a custom CSV writer, which I guess is possible but not something I'm super enthusiastic about deviating from the standard library on 😅

@ianlancetaylor
Copy link
Contributor

There are so many variants of CSV that we decided a long time ago to keep the standard library package only focused on RFC 4180, as the package documentation states. We kept the options that had already been added, but we aren't going to add any more. The package is short and easy to copy for people who need different formats.

@golang golang locked and limited conversation to collaborators Apr 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants