proposal: Go 2: capability based security via stateless packages #23267

Closed
wora opened this issue Dec 28, 2017 · 55 comments
Labels
FrozenDueToAge LanguageChange Proposal v2 A language change or incompatible library change
Milestone

Comments

@wora

wora commented Dec 28, 2017

Gocap

Gocap proposes a simple way to introduce capability-based security to the Go language. It allows a Go application to safely import or load untrusted third-party libraries, such as data encoding or decoding libraries, with little security risk.

Problem

Most programming languages don't provide security protection within a single process. In other words, every piece of code runs with the privileges of the current process. This means a third-party library can easily steal sensitive data and exfiltrate it elsewhere. It also means a third-party library can directly attack the surrounding environment, for example by sending malicious requests to other processes or servers.

A few languages, such as Java and C#, do support in-process security protection, originally introduced to run Java applets. The basic model is a complicated ACL scheme based on context, such as the call stack. However, the user experience is generally very poor and hardly anyone uses it in real production environments.

In modern programming environments, using third-party libraries is an inevitable reality, especially with open source software and package management. Developers use convenient tools, such as maven install, to download and use third-party libraries without much thought.

Currently, the only safe way to run third-party libraries is to use a sandbox, such as a browser, or to use VMs. These approaches have very high runtime overhead, and it is typically impractical to break a single application into multiple sandboxes or multiple VMs.

If there were a simple way to provide security protection within a single process, building applications with third-party libraries would be much safer. The capability-based security model is a well-known technique that provides simple and resilient security protection in various contexts; see https://en.wikipedia.org/wiki/Capability-based_security. Many people consider it a reasonable choice for providing security at the language level. This proposal (Gocap) suggests a simple way to implement capability-based security for the Go language.

Proposal

This proposal introduces a new concept to the Go language, the stateless package, which is defined as follows:

  • A stateless package must not have any global variables.
  • A stateless package must not access state outside the current program, for example by reading arbitrary memory, reading the current time, or issuing system calls.
  • A stateless package can only import other stateless packages.

A stateless package can be expressed with the following syntax, which reuses the const keyword:

package const math

Because a stateless package doesn't have any state, it can only operate on inputs and generate output, and the inputs become the security capability. Conceptually, a stateless package is equivalent to a complicated pure function. However, since Go is an imperative language, a stateless package can still destroy the inputs and cause serious damage. It requires the caller to protect the inputs using primitive values, data copies, or immutable interfaces.
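The point about protecting inputs can be illustrated with a minimal sketch. The names Upper and CallWithCopy are hypothetical and exist only for this example: Upper stands in for an untrusted library function that has no state of its own but still overwrites what it is given, and CallWithCopy is one way a caller might shield its data.

```go
package main

// Upper stands in for an untrusted, stateless library function (hypothetical).
// It has no globals and no imports, yet it still overwrites its input in place.
func Upper(b []byte) []byte {
	for i, c := range b {
		if c >= 'a' && c <= 'z' {
			b[i] = c - ('a' - 'A')
		}
	}
	return b
}

// CallWithCopy hands the library a private copy, so the caller's
// original slice cannot be damaged.
func CallWithCopy(f func([]byte) []byte, in []byte) []byte {
	tmp := make([]byte, len(in))
	copy(tmp, in)
	return f(tmp)
}
```

Calling Upper directly on a slice changes the caller's data, while CallWithCopy(Upper, data) leaves data untouched at the cost of one allocation and copy per call.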

A stateless package cannot import stateful packages, such as io and network, and therefore cannot read the user's home directory or open a network connection. This reflects the fundamental design principle of capability-based security. Opening a file is a capability, and a stateless package should not have any capability besides its function inputs.

Implementation

The implementation requires only minimal changes to the Go linker. The linker needs to verify that a stateless package doesn't have any state, e.g. global variables, and only depends on other stateless packages. The stateless property can be a boolean flag on the package definition, so the linker validation can be done quickly.
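As a rough illustration of the "no global variable" half of that check, here is a sketch that works on source text rather than in the linker, which is an assumption made for brevity. It uses the standard go/parser and go/ast packages; the function name GlobalVars is made up for this example.

```go
package main

import (
	"go/ast"
	"go/parser"
	"go/token"
)

// GlobalVars parses one Go source file and returns the names of its
// package-level variables. A stateless package would have to return none.
// This is a source-level approximation, not the linker check itself.
func GlobalVars(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		gen, ok := decl.(*ast.GenDecl)
		if !ok || gen.Tok != token.VAR {
			continue // functions, imports, types, and constants are fine
		}
		for _, spec := range gen.Specs {
			for _, name := range spec.(*ast.ValueSpec).Names {
				names = append(names, name.Name)
			}
		}
	}
	return names, nil
}
```

Only top-level declarations are inspected, so local variables inside functions are not reported. Checking imports would be a separate pass.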

Constant Support

One tricky problem with the proposal is constant support. To make core libraries, such as unicode, usable by stateless packages, the core libraries themselves must be stateless. This requires the Go compiler and linker to support complicated constants, including structs, arrays, maps and more. This functionality is equivalent to C++ constexpr. These complicated constants would be placed in a read-only data segment at runtime, so no runtime code could modify them after the package is loaded.

This part is by far the most complicated work for this proposal. Had Go supported complicated constants, one could implement Gocap using standard Go with very minor modifications.
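To make the constant problem concrete: Go constants today are limited to basic types, so a lookup table normally lives in a package-level variable. The sketch below shows one workaround available in current Go, expressing a small table as code. DigitName and the commented-out digitNames are hypothetical names used only for illustration.

```go
package main

// A package-level table like the one below is what the proposal forbids,
// because any code in the package could modify it at run time:
//
//	var digitNames = map[int]string{0: "zero", 1: "one", 2: "two"}

// DigitName expresses the same table as code. There is no variable to
// mutate, so the lookup stays stateless under today's Go.
func DigitName(d int) string {
	switch d {
	case 0:
		return "zero"
	case 1:
		return "one"
	case 2:
		return "two"
	}
	return ""
}
```

This works for a handful of entries but is not a realistic substitute for large tables like those in unicode, which is why the proposal asks for richer constants.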

Benefits

If Go implements this proposal, most third-party libraries can be designed and implemented as stateless packages. They can be safely used by browsers and servers. One real-world example is a data transcoding library. Google runs many services that process user data in various formats, such as photos and videos. This requires the use of many third-party libraries. It is impossible to audit every such library, so the only way to run them is to use either a sandbox or a VM, plus infrastructure to manage them.

With stateless packages, any data transcoding can be handled by a simple interface implemented by stateless packages, such as:

Transcode(Reader, Writer, Config) error

With a proper implementation of Reader and Writer, such an interface has very little security risk regardless of who implements it.
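One possible reading of "proper implementation of Reader and Writer" is that the caller wraps both ends before handing them over. The sketch below assumes that reading. quotaWriter and RunRestricted are hypothetical names, the Transcode here is a stand-in that just copies bytes, and it uses the standard io.Reader and io.Writer interfaces, which the proposal as written would classify as coming from a stateful package.

```go
package main

import (
	"bytes"
	"errors"
	"io"
	"strings"
)

// quotaWriter forwards writes to w until limit bytes have been written.
type quotaWriter struct {
	w     io.Writer
	limit int
}

func (q *quotaWriter) Write(p []byte) (int, error) {
	if len(p) > q.limit {
		return 0, errors.New("write quota exceeded")
	}
	q.limit -= len(p)
	return q.w.Write(p)
}

// Transcode stands in for untrusted code: it can only touch r and w.
func Transcode(r io.Reader, w io.Writer) error {
	_, err := io.Copy(w, r)
	return err
}

// RunRestricted gives Transcode at most maxIn bytes of input and
// accepts at most maxOut bytes of output.
func RunRestricted(input string, maxIn int64, maxOut int) (string, error) {
	var out bytes.Buffer
	r := io.LimitReader(strings.NewReader(input), maxIn)
	err := Transcode(r, &quotaWriter{w: &out, limit: maxOut})
	return out.String(), err
}
```

The wrappers bound how much the callee can read and write. They do nothing about what bytes it chooses to write.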

The same design principle allows safe loading of dynamic libraries, go get third party libraries, runtime scripting, web apps, and other use cases.

Summary

Gocap proposes using stateless packages to implement capability-based security for the Go language. It provides great security and usability for a wide range of use cases. With proper constant support, this proposal requires only minimal changes to the Go language and runtime.

Questions

Where did this proposal come from originally?

This proposal was originally brought up internally at Google well before Go was released to the public. It was intended to let Go run third-party code safely and offer a significant usability advantage over Java. However, security was not a focus for the Go language. The lack of constant support makes this proposal infeasible.

What is a concrete use case for this proposal?

YouTube uses many third party libraries for video transcoding. Because such libraries change often, it is impossible to audit the source code. If a third party library is a stateless package, you can simply call it like Transcode(input []byte, settings map[string]string) []byte, and there is no security risk besides CPU and memory cost.

Does this proposal depend on immutable parameters?

No. This proposal does not depend on immutable parameters, but it would greatly benefit from immutable parameters.

Without immutable parameters, a stateless package may destroy input values, but this can be handled through better library design, similar to Java immutable collections. Supporting immutable parameters requires significant language and library changes. I don't think the benefit outweighs the cost.

To provide security, we would need deep immutability, which requires every function to correctly mark each pointer parameter as const. C++-style const does not offer much security, and it forces workarounds using the mutable keyword.

Why use untrusted code in a secure application?

We should not trust any code downloaded from the internet, but we cannot live without it. Most third-party libraries can and should be stateless packages, and then we don't even need to worry about their security risk.

Why is security important for Go?

Large organizations, such as those in healthcare and the military, cannot trust the third-party libraries they have to use. It is a huge burden to design a security solution to this problem. Most people simply live with the problem because there is no other choice.

@davecheney
Contributor

I don’t think this proposal will be successful for two reasons

  1. It does not prohibit the use of the unsafe package, so it provides no protection against untrusted code mutating anything it wants.
  2. It is a halfway house to real immutability. This proposal doesn’t rule out immutability, and doesn’t provide any solution for this feature (see rsc’s blog post) so it is likely that some form of immutability or const would be added to the language at some point, leaving stateless packages as a strange detour which is no longer necessary.

I think this proposal should be rejected and efforts placed instead into integrating one form of const or immutability into Go 2.

@wora
Author

wora commented Dec 28, 2017

@davecheney I would like to understand your comment better.

Can you explain how "the usage of the unsafe package" allows mutating anything, like https://golang.org/src/unsafe/unsafe.go? I was involved with the Java unsafe package many years ago. I am quite confident in designing an unsafe package that is secure and usable by stateless packages. Whether to mark unsafe as stateless depends on how much time we want to spend on refactoring existing libraries. This is an engineering time issue, not a design issue.

Functional languages, such as Haskell, offer no real security in practice. Rust supports const, but its core libraries end up making massive use of unsafe, and Rust's unsafe is significantly more powerful than Go's unsafe. Immutability offers protection for input values, but it doesn't provide real security in general practice.

One way to settle this type of design decision is to choose several key use cases and compare the proposed developer experience and runtime architecture side by side. Let's not assume that immutability is a better solution than stateless packages without sufficient data and comparison. One reason that functional languages are not widely adopted is practicality.

@rsc probably still remembers my original proposal from years ago. I am happy to hear his thoughts.

@jaekwon

jaekwon commented Dec 28, 2017

Can you explain how "the usage of the unsafe package" allows mutating anything, like https://golang.org/src/unsafe/unsafe.go? I was involved with the Java unsafe package many years ago. I am quite confident in designing an unsafe package that is secure and usable by stateless packages. Whether to mark unsafe as stateless depends on how much time we want to spend on refactoring existing libraries. This is an engineering time issue, not a design issue.

@wora I don't understand, the purpose of unsafe is to defeat the type system. If a malicious library creates a [128]byte and fills it with the right information, and uses unsafe to turn it into say a SecretStore, that would be obviously unsafe. That the module unsafe is stateless doesn't really matter.

Often, a module is stateless, but it has NewXYZ() functions that are the real entrypoint to using the module. So these modules could be marked as "pure", yet all the side effects would reside in the &struct{}'s returned from the module. So I don't see how it really helps the developer.

I could also use steganography to communicate to "pure" modules. Something like this:

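import (
  "fmt"
  "io"
  "strconv"
)

type MyStruct struct {
  // all fields are private
}
func PureFunc(s *MyStruct, output io.Writer) {
  // %p formats the address as hex; parse it back into an integer.
  addr, _ := strconv.ParseUint(fmt.Sprintf("%p", s), 0, 64)
  if addr%10000 == 0 {
    output.Write([]byte("<script .../>"))
  }
}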

I could do the same for your Transcode example by chewing the cable at the right time:

func Transcode(r Reader, w Writer, c Config) error {
  b := make([]byte, 2048)
  for {
    n, _ := r.Read(b)
    // A read of exactly 1337 bytes acts as the hidden trigger.
    if n == 1337 {
      w.Write([]byte("<script .../>"))
    }
    ...
  }
}

So, it's still possible to have lurking exploits with stateless functions. It appears to me that this "stateless" module feature doesn't really add any security, but instead it misleads the programmer to think that it is more secure/deterministic than it actually is.

With the alternative const-mutable type proposal, we can remove module-level vars entirely, but modules could still have mutable state. Disallowing module-level vars which is currently being abused seems more pressing.

@crvv
Contributor

crvv commented Dec 28, 2017

I don't understand why this is a "security" proposal.
Can you tell me what will be forbidden by this "stateless package"?

@ianlancetaylor ianlancetaylor changed the title Proposal: Gocap - A simple way to introduce capability-based security to Go language proposal: Go 2: capability based security via stateless packages Dec 28, 2017
@ianlancetaylor ianlancetaylor added v2 A language change or incompatible library change LanguageChange labels Dec 28, 2017
@ianlancetaylor
Contributor

Analyzing a proposal for enhancing security requires a threat model. We have to understand what attacks we are protecting against. I think that needs to be clarified here.

For example, if the threat model is "an untrusted third party library might extract security keys and send them to an attacker," then clearly untrusted libraries may not be permitted to import "unsafe". A package that imports "unsafe" can examine anything in memory.

It's not clear to me why preventing a third party library from having mutable global variables provides any additional security. For example, even without mutable global variables, a package could open a network connection to an external database and use it as a mutable store.

@wora
Author

wora commented Dec 28, 2017

@ianlancetaylor The threat model is global access, such as reading a file in the user's home directory. The worst thing a stateless package can do is damage its inputs; it cannot read or write anything else.

I must have misunderstood the design of the current Go unsafe package. I thought unsafe allows you to turn a Go pointer into a uintptr, but I didn't expect it to work the other way around. I will remove the reference to the unsafe package, since it is not part of my proposal anyway.

The io and network packages are stateful, therefore a stateless package cannot possibly import them. I will update my proposal to mention it explicitly. A network connection has a file descriptor, which is a global variable by nature.

@wora
Author

wora commented Dec 28, 2017

I have removed the reference to the unsafe package from my proposal. Please reread my original proposal above. Sorry for the confusion.

@ianlancetaylor
Contributor

Thanks. The threat model still seems a bit underspecified to me. You say that the os and net packages have state, but they don't have any global variables (at least, not any important ones) so I still don't clearly understand the relationship between mutable global variables and the property you are after.

If you don't want to permit a library to do something external to the program, you can ban importing the unsafe and syscall packages, and any packages that import those packages. That will leave you with only pure computation. In that scenario, what is the threat of mutable global variables? And what in particular is the threat of importing other, trusted, packages that themselves use mutable global variables?
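The rule "ban unsafe, syscall, and anything that imports them" amounts to a reachability check over the import graph. Here is a minimal sketch under the assumption that the graph is already available as a map. The function name Forbidden is hypothetical, and any graph passed to it in practice would have to come from real tooling.

```go
package main

// Forbidden reports whether pkg imports, directly or transitively,
// any package in the banned set. imports maps a package path to its
// direct imports.
func Forbidden(pkg string, imports map[string][]string, banned map[string]bool) bool {
	seen := map[string]bool{}
	var visit func(p string) bool
	visit = func(p string) bool {
		if banned[p] {
			return true
		}
		if seen[p] {
			return false // already explored, nothing banned below it
		}
		seen[p] = true
		for _, dep := range imports[p] {
			if visit(dep) {
				return true
			}
		}
		return false
	}
	return visit(pkg)
}
```

With banned set to unsafe and syscall, a package is rejected as soon as any path through its dependencies reaches either one, which is also why trusted intermediaries such as os would disqualify their importers under this rule.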

It's not clear to me that permitting only pure computation packages is very useful. How many third party libraries that do only pure computation are actually usable? I understand your example of transcoding libraries, but I have to imagine that that is a pretty small subset of the libraries out there. Are there other examples?

@wora
Author

wora commented Dec 28, 2017 via email

@pciet
Contributor

pciet commented Dec 28, 2017

Read-only types proposal: #22876

Ian’s suggestion of inspecting the third-party libraries for improper imports (like a transcoding library importing net/http) as part of the build process seems like the right solution for this problem to me.

Dave shows how to do this with go list: https://dave.cheney.net/2014/09/14/go-list-your-swiss-army-knife

@ianlancetaylor
Contributor

@wora Thanks for the explanation. I now think that I don't understand what you are proposing. What is the exact definition of a "stateless package," other than "it may only import stateless packages?"

You mention the unicode package as though it is a problem; in what way is the current unicode package not already a "stateless package?"

@ianlancetaylor
Contributor

Another question: what do global variables have to do with this? If we permit a "stateless package" to have global variables, what threat are we permitting that is not otherwise available?

@wora
Author

wora commented Dec 28, 2017 via email

@ianlancetaylor
Contributor

It can not access any global state outside the process, such as current time, file descriptor, syscalls.

Since a goal of this proposal is automatic determination that a package is safe, can you make that statement precise?

Or, it's possible that that statement is unnecessary, and the constraints are satisfied by

  1. no global variables
  2. only import safe packages

It would be possible to audit a set of existing Go packages to see how many could exist within those constraints. If the percentage is small, it seems unlikely that this proposal would be useful in practice.

@wora
Author

wora commented Dec 29, 2017

I updated the original proposal to make the spec more precise, but it likely needs more revision to become a real spec.

My intention is to force library developers to declare whether a package is stateless, so that the Go linker can easily validate the declaration.

In theory, we could automatically mark a package as safe or stateless. The problem is that a package can suddenly become unsafe by introducing a global variable or adding a log statement. That would trigger cascading build failures all over the place.

I don't expect many existing Go packages to be stateless, but new packages can easily be developed as stateless packages, which can be used in secure environments.

@ianlancetaylor Your question does bring up a critical point. The existing Go library design mixes stateful and stateless elements together. For example, io.Reader is a harmless interface, but since it is defined inside the io package, it cannot be used by a stateless package.

We could change the spec to allow using type definitions from unsafe packages while disallowing any calls into them. This essentially moves the stateless concept to the function level instead of the package level. It allows partial reuse of existing packages, at the cost of a much more complicated safety tracking system (both human cost and linker complexity).

Overall, I think my original proposal would offer an easy-to-use security model for Go, especially for new libraries. However, refactoring existing libraries may still be costly.

PS: we could rename the concept from stateless to safe and adopt a more liberal definition of what is safe if necessary.

@crvv
Contributor

crvv commented Dec 29, 2017

YouTube uses many third party libraries for video transcoding. Because such libraries change often, it is impossible to audit the source code. If a third party library is a stateless package, you can simply call it like Transcode(input []byte, settings map[string]string) []byte, and there is no security risk besides CPU and memory cost.

I have hardly heard of a video transcoding library written in Go.

Besides, video transcoding often needs something like DirectX, CUDA and OpenCL.
I don't think a library can use them without syscall, unsafe or cgo.

Maybe math could be a stateless package, but many functions in math are written in assembly, which is obviously unsafe.

@wora
Author

wora commented Dec 29, 2017

Why do you think math is not unsafe? All code runs as machine instructions, produced either by compilers or by humans. Compilers are also written by humans. At the end of the day, you have to trust some humans. This proposal implicitly trusts the Go compiler, linker and core libraries.

If we don't trust the Go compiler, linker, and core libraries, the standard solution is a sandbox or VM, both of which are widely adopted. It is a fine choice for Go to leave the security problem to sandboxes and VMs, as other languages do. This proposal is a usable solution, but I am not arguing that Go must provide in-process security.

Conceptually, the security boundary can be drawn at function level (Rust), package level (this proposal), process level (sandbox), OS level (VM), machine level (dedicated instance), data center level (on-premise and Gov Cloud), regions, or even country level (Great Firewall). I am pretty open to all these choices.

@crvv
Contributor

crvv commented Dec 29, 2017

I don't think math is unsafe.
I mean assembly is unsafe.

My point is:
Assembly, C and GPU code are often required to implement a pure computation library (a stateless package), but they would be forbidden by this proposal.

Video transcoding and math are examples.
Another example is AES. The constant-time algorithm is implemented in assembly in the Go stdlib.

@wora
Author

wora commented Dec 29, 2017 via email

@pciet
Contributor

pciet commented Dec 29, 2017

Auditing third-party libraries is a manual work. It cannot be done at scale, and it is limited by human errors.

An automated auditing tool is possible for these security considerations.

The Go standard library packages do mix a lot of behavior so locking out entire packages because of something like an unsafe or syscall package import may not be reasonable for your goals.

But we are working with versioned standard library APIs, so an option is to audit the standard library and require third party libraries to follow a list of allowed standard types, functions, and global vars. Without the standard library only safe computation can be done.

go list already has logic to determine imports recursively, and additional logic would lock out specific standard library packages completely or have a blacklist or whitelist of functions/types/vars in a package. There would need to be a requirement of Go-only source.

Auditing the standard library for usage that matches your security needs is not a huge task.

Perhaps this library auditing tool could be part of the standard language distribution.

@as
Contributor

as commented Dec 31, 2017

I agree with everyone that suggested auditing because this seems like a compile time problem that wants to be solved at runtime. Has it been demonstrated that it is impossible to derive state in a function call by observing the runtime indirectly?

I have hardly heard of a video transcoding library written in Go.
Besides, video transcoding often needs something like DirectX, CUDA and OpenCL.
I don't think a library can use them without syscall, unsafe or cgo.

There are closed-source codec implementations written in pure Go. They don't need any of these things.

YouTube uses many third party libraries for video transcoding. Because such libraries change often, it is impossible to audit the source code.
Transcode(input []byte, settings map[string]string) []byte

Video transcoding is not stateless. Even a decoder without package state sounds like a logistical nightmare. That call to Transcode needs to initialize 2 decoders and 1 encoder every time it's called and would need to process a complete elementary stream. They are also incompatible with this proposal because you need to be able to measure time to encode a conforming stream.

@wora
Author

wora commented Jan 2, 2018

Whether transcoding is stateless depends on the implementation. I don't think a decoder without state would be a nightmare. Such a problem can be solved by a minor API change.

Transcode(reader, writer, timer, config) error

To measure time, the caller can pass a timer object to the transcoding function, as above. If we decide to use capability-based security, the library needs to be written in a way that is compatible with the model. This would be straightforward in a language built on capability-based security from day one. Fitting any security model to an existing language is extremely hard, since existing libraries don't conform to the model and developers would not be happy to think about things in different ways within the same language.
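Passing time in as a capability could look roughly like the sketch below. The Timer interface, fakeTimer, and Stamp are all hypothetical names invented for this illustration and are not part of any standard package. The library code imports nothing, so the Timer it receives is its only view of time.

```go
package main

// Timer is the only view of time the library gets. The caller decides
// what it reports (a hypothetical interface for this sketch).
type Timer interface {
	NowMillis() int64
}

// fakeTimer is a caller-controlled clock that advances by a fixed step
// on every reading.
type fakeTimer struct{ now, step int64 }

func (t *fakeTimer) NowMillis() int64 {
	t.now += t.step
	return t.now
}

// Stamp stands in for library code that needs timestamps. It imports
// nothing, so the passed-in Timer is its only source of time.
func Stamp(frames []string, timer Timer) []int64 {
	stamps := make([]int64, len(frames))
	for i := range frames {
		stamps[i] = timer.NowMillis()
	}
	return stamps
}
```

A caller that trusts the library with real time would pass a Timer backed by the system clock. A caller that does not could pass a coarse or synthetic one like fakeTimer.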

FWIW, I don't think a real transcoder would use package state. That would prevent transcoding two streams at the same time. Most libraries should support parallel data processing.

Compared to browser sandboxes and VMs, it is much simpler and more efficient to use a capability-based security model within a process. However, it would still be a significant amount of work for library authors to adopt the model.

@as
Contributor

as commented Jan 2, 2018

It appears we are not on the same page. Maybe the misunderstanding here comes from a lack of formalization of what it means for a package to have state. Is it:

  • No package level variables?
  • No exported package level variables?
  • No object state?
  • No exported object state?

The discussion seems to imply that stateless means it is impossible for some func to compute any information outside the provided interfaces passed into it, including previous calls to itself, because of this statement:

The reason to prevent global variable is to prevent one function invocation stores some value into a global variable, and another function invocation read the value back.

It makes very little sense to disallow package scoped variables but allow objects to have state when these objects are part of the untrusted package. If the function can tell that it's been executed before, that function can derive state from within its object (in this case the function is a method).
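A small sketch of that objection, with hypothetical names Decoder, NewDecoder, and Decode: the package below has no package-level variables and no imports, yet a method can still tell how often it has been called and change its behavior accordingly.

```go
package main

// Decoder has no package-level variables, yet it remembers every call.
type Decoder struct {
	calls int
}

// NewDecoder is the package's entry point.
func NewDecoder() *Decoder { return &Decoder{} }

// Decode behaves normally until the third call on the same object,
// then changes its output. The trigger lives entirely in object state.
func (d *Decoder) Decode(in string) string {
	d.calls++
	if d.calls == 3 {
		return "<tampered>"
	}
	return in
}
```

Two calls on a fresh Decoder return their input unchanged, and the third returns different data, so forbidding package-level variables alone does not stop behavior that depends on earlier calls.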

Transcode(reader, writer, timer, config) error

A video encoder generates output by compressing spatial and temporal information in the signal. Encoders don't just need a timer, they need to know what was encoded before. To do this without propagating error, modern encoders contain most of the decoder as well, and store a partially-decoded state of what the encoder encoded. The elementary stream itself is stored in a container format, which provides metadata for the elementary stream so that a decoder can decode that stream. At the lower levels of the stream are many similar state machines. To make matters worse, the decision of which machine to use depends not only on the profile/level of the encoder (which I assume you could put in the config), but also on a characteristic of the stream itself. This crosses off the possibility of removing the object state. The possibility of exporting all of these state carriers as function parameters is slim, unless you want a package function with hundreds of parameters (many of these states exist in warm or hot loops, but speed/simplicity/security, pick two or pick security).

At the most abstract level, we can explore the security guarantees of a video encoder by simply creating a wrapper around ffmpeg, which provides all of these complex pieces in a command line executable. If we call ffmpeg to transcode a video with the above function signature, what security guarantees does that offer? Well, we are calling an executable that spawns as a process and it can do whatever the operating system permits.

But what if we implement all of the pieces of ffmpeg in Go? Ok, assume we did that. What security guarantee do we have now? No memory corruption, for one, but Go already provides this for free. Even then this hypothetical ffmpeg still has its own state, whether it stores it internally in a process or an object or a package level variable doesn't matter. My original point is that the video transcoder example is impossible by design if you remove object state.

One benefit of this proposal is that the hypothetical ffmpeg can't import net and transfer the video to an adversary. This seems like a side channel security issue that is best handled by the operating system. It also ignores a relevant issue: there's nothing that can be done to prevent an adversary from adding a special state to the go ffmpeg package that replaces the entire stream with malicious or corrupted output based on what the encoder or decoder has seen before as a state. This violates the invariant that a function can't make a decision to be malicious based on prior input. In fact, there is no way to verify what behavior is malicious in this context.

The proposal seems to target these cases where the developer crosses an imaginary trust boundary based on package scope, and interacts with some data outside the package. It presents no formal definition of security and relies on general assumptions of what defines security, and this makes the current proposal very difficult to reason about.

@wora
Author

wora commented Jan 2, 2018 via email

@pciet
Contributor

pciet commented Jan 3, 2018

The example by @crvv doesn’t look like global state to me. I interpreted the idea as meaning global vars that can be edited by external packages. Keeping state in a caller-owned var is stateless for this context.

For example, package os has Stdin, Stdout, Stderr, and the shell args all as editable global vars which may be a security problem in some applications, such as a malicious library redirecting sensitive Stdout information to a network connection. Many packages have global error vars that may be changed to introduce another error handling behavior that uncovers a security hole found only by an attacker.
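The os.Stdout case can be demonstrated with a short sketch. Hijack and Capture are hypothetical names. Hijack plays the role of a library that reassigns the exported variable, and Capture shows what the application then observes. The restore step is there only so the demonstration cleans up after itself.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// Hijack stands in for a library that redirects the process-wide os.Stdout.
// It returns the read end of a pipe and a function that undoes the change.
func Hijack() (*os.File, func(), error) {
	r, w, err := os.Pipe()
	if err != nil {
		return nil, nil, err
	}
	orig := os.Stdout
	os.Stdout = w // any package can assign to this exported variable
	restore := func() {
		w.Close()
		os.Stdout = orig
	}
	return r, restore, nil
}

// Capture shows the effect: text the application prints after Hijack
// ends up in the pipe instead of on the terminal.
func Capture(msg string) (string, error) {
	r, restore, err := Hijack()
	if err != nil {
		return "", err
	}
	fmt.Print(msg) // application code, unaware of the redirection
	restore()
	data, err := io.ReadAll(r)
	r.Close()
	return string(data), err
}
```

Whatever the application prints between Hijack and restore is readable from the pipe, which could just as easily be forwarded elsewhere.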

My mental model is that syscall, os, and net packages directly access states outside the current process, so they are NOT stateless. Therefore, a stateless package can not import them. I wonder what would be the best way to phrase such intention.

These packages have effects outside of the program’s address space, which can only be done with operating system calls (syscall).

The primary use of syscall is inside other packages that provide a more portable interface to the system, such as "os", "time" and "net". Use those packages rather than this one if you can.

@as
Contributor

as commented Jan 3, 2018

In the paragraph you quoted, a state is any bit of data outside the current user address space.

The environment variables, file descriptors, GDI handle buffers, current working directory handle, thread environment block, process name, loaded module list, etc all live or are referenced by the process environment block, which is in the "current user address space". I don't know what a user address space is, but I do know that executing:

	MOVQ	0x60(GS), AX
	MOVQ	AX, ret+0(FP)

In any Windows amd64 process gives you a pointer to the PEB in rax, and the ability to modify this large data structure and its associated bits.

@ianlancetaylor
Contributor

Whatever a stateless package is, I think it's clear that it must not contain assembler code, and must not import the unsafe package.

@wora
Author

wora commented Jan 3, 2018

Let me clarify the porposal a bit.

The proposal has at least 4 required conditions, more if we found bugs.

  • This proposal provides capability-based security for Go language. It doesn't cover assembler code or foreign function calls. These things should be handled by manual code audit and whitelisting.
  • Cannot have global variable.
  • Cannot access state outside the user space.
  • Cannot depend on unsafe package.

We should not debate individual conditions. Obviously, no individual condition is sufficient to provide any security. I believe these 4 conditions together are sufficient to offer capability-based security for Go. However, security researchers may find flaws with the proposal and I am happy to learn and improve on it.

Whether Go needs a security model, or whether a capability-based security model is the preferred choice, is a separate topic. I don't plan to address either in this issue.

@as
Contributor

as commented Jan 4, 2018

Necessary conditions for security are clear definitions of security and an outline of assumptions those definitions rely on. What do you mean by user space? Are you talking about user vs kernel mode or something completely different?

@wora
Author

wora commented Jan 4, 2018 via email

@josharian
Contributor

As I believe @randall77 pointed out elsewhere, if you're excluding assembly because the author can use it to do nefarious things, you probably also need to exclude the go keyword, since the author can use it to create intentional data races, which can be used to do anything that assembly could do.

@wora
Author

wora commented Jan 11, 2018

I agree. The go keyword can create data races, so it can't be used in a stateless package.

@as
Contributor

as commented Jan 11, 2018

Because writing to the same memory location repeatedly may affect the value of adjacent memory, equals shouldn't be used in a stateless package either.

@wora
Author

wora commented Jan 12, 2018

Row hammering should be treated as a hardware bug. To avoid such a problem, you couldn't even run two VMs on the same machine. You could only run one trusted application per machine, which is not a viable business.

With a goroutine, you can return an array from a function and use the goroutine to keep changing the array data. That would be a huge security risk.
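A minimal sketch of that risk, with hypothetical names (untrustedDecode, demo). The channels are there only to make the demonstration deterministic and free of data races; a real attacker would not synchronize, and would simply keep writing to the returned slice. The sketch assumes the returned value is a slice, since a slice shares its backing array with whoever produced it.

```go
package main

import "fmt"

// untrustedDecode is a hypothetical library function. It returns a slice but
// keeps a goroutine alive that still holds a reference to the backing array.
func untrustedDecode(trigger <-chan struct{}, done chan<- struct{}) []byte {
	buf := []byte("safe")
	go func() {
		<-trigger         // wait until the caller has validated the data
		copy(buf, "evil") // then rewrite it behind the caller's back
		close(done)
	}()
	return buf
}

// demo validates the data, lets the hidden goroutine run, and reports what
// the caller sees afterwards, with and without a defensive copy.
func demo() (validated bool, after, snapshot string) {
	trigger := make(chan struct{})
	done := make(chan struct{})

	data := untrustedDecode(trigger, done)
	private := append([]byte(nil), data...) // defensive copy owned by the caller
	validated = string(data) == "safe"      // the check passes here

	close(trigger)
	<-done // the library's goroutine has now modified data

	return validated, string(data), string(private)
}

func main() {
	validated, after, snapshot := demo()
	fmt.Println(validated, after, snapshot) // true evil safe
}
```

Copying the slice before validating it protects the caller in this sketch, but it relies on every caller remembering to do so, which is the kind of discipline the proposal wants to avoid depending on.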

@jaekwon

jaekwon commented Jan 20, 2018

Finally figured out why not having module-level state is good...

I was working on github.com/tendermint/go-wire (sdk2 branch), when I realized that any call to wire.RegisterConcrete(...) could have devastating effects in unrelated decoding logic. In short, callers who rely on the global wire.(Un)Marshal*() are at the mercy of anyone calling the global wire.Register*() functions. (which changes the way registered types are marshaled and unmarshaled, which can break binary compatibility). One way to solve this problem is to get rid of the global implicit &wire.Codec{} instance altogether, so that all users of the go-wire module are forced to create their own (hopefully unexposed) codec instances.
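A rough sketch of the "no implicit global codec" idea follows. The Codec type and its methods here are hypothetical stand-ins written for this comment; they are not the go-wire API. The point is only that registrations made on one instance cannot affect another.

```go
package main

import "fmt"

// Codec is a hypothetical stand-in for a go-wire style codec. Each instance
// owns its registry, so there is no package-level state to tamper with.
type Codec struct {
	names map[string]string // Go type name -> wire name
}

// NewCodec returns an empty codec with its own private registry.
func NewCodec() *Codec {
	return &Codec{names: make(map[string]string)}
}

// RegisterConcrete records the wire name for v's concrete type on this
// instance only.
func (c *Codec) RegisterConcrete(v interface{}, name string) {
	c.names[fmt.Sprintf("%T", v)] = name
}

// WireName reports the name registered for v's concrete type, if any.
func (c *Codec) WireName(v interface{}) (string, bool) {
	name, ok := c.names[fmt.Sprintf("%T", v)]
	return name, ok
}

// Tx is an example application type.
type Tx struct{ Amount int }

func main() {
	mine := NewCodec()
	mine.RegisterConcrete(Tx{}, "app/Tx")

	// Another package registering the same type on its own instance
	// has no effect on ours.
	theirs := NewCodec()
	theirs.RegisterConcrete(Tx{}, "other/Tx")

	name, _ := mine.WireName(Tx{})
	fmt.Println(name) // app/Tx
}
```

As long as the application keeps its codec instance unexported, a dependency has no handle through which to change how the application's types are encoded.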

@as
Contributor

as commented Jan 21, 2018

Row hammering should be treated as a hardware bug. To avoid such a problem, you couldn't even run two VMs on the same machine. You could only run one trusted application per machine, which is not a viable business.

No, you can run an unbounded number of trusted applications per machine. They're trusted for a reason, aren't they? The question is whether this proposal (which frames security into the narrow tube of information hiding) is a good substitute for traditional code review. Row hammering and friends are practical issues for modern-day processors; ignoring them is academic ignorance: making it work in theory is useless if it can't work on real systems with real security requirements.

In short, it's not that module-level state is bad per-se, but I think what's important is that the way we want to use module-level state leads to insecure logic, for the same reason why having module-level vars are bad.

The proposal still makes an assumption that executable code can protect itself in the same process space. What modern system supports this? I feel like this proposal is an attempt to make it easier for people to incorporate random third party code in their go projects rather than achieving security.

@jaekwon

jaekwon commented Jan 21, 2018

@josharian: As I believe @randall77 pointed out elsewhere, if you're excluding assembly because the author can use it to do nefarious things, you probably also need to exclude the go keyword, since the author can use it to create intentional data races, which can be used to do anything that assembly could do.

I get the data-race problem, but the solution is to fit more into 64 bits, and/or, to use 128 bit processors.

Because writing to the same memory location repeatedly may affect the value of adjacent memory, equals shouldn't be used in a stateless package either.

Seems like a problem to be solved at the RAM level. Maybe replication can help as well.

@jaekwon

jaekwon commented Jan 21, 2018

I was working on github.com/tendermint/go-wire (sdk2 branch), when I realized that any call to wire.RegisterConcrete(...) could have devastating effects in unrelated decoding logic. In short, callers who rely on the global wire.(Un)Marshal*() are at the mercy of anyone calling the global wire.Register*() functions. (which changes the way registered types are marshaled and unmarshaled, which can break binary compatibility). One way to solve this problem is to get rid of the global implicit &wire.Codec{} instance altogether, so that all users of the go-wire module are forced to create their own (hopefully unexposed) codec instances.

OTOH, on some modules, you'll want to use a var or mutable-const to memo-ize computations for performance... I just encountered this while trying to use a stateless go-wire module. The var just gets pushed out to external modules, but they're still there, even if they are used in an idempotent way.

@wora
Author

wora commented Jan 21, 2018 via email

@josharian
Contributor

I get the data-race problem, but the solution is to fit more into 64 bits, and/or, to use 128 bit processors.

I don't understand how "fitting more into 64 bits" fixes the problem of untrusted code using data races to achieve pernicious goals. Or, for that matter, how it fixes any data race problems at all.

@wora
Author

wora commented Jan 21, 2018 via email

@jaekwon

jaekwon commented Jan 22, 2018

I don't understand how "fitting more into 64 bits" fixes the problem of untrusted code using data races to achieve pernicious goals. Or, for that matter, how it fixes any data race problems at all.

There is a particular data-race problem where three go-routines can coordinate to create an interface var value that would otherwise be impossible without the usage of the "unsafe" module. Since our CPUs are 64-bit, but Golang interface var's are 128-bit on a 64-bit machine, 2 goroutines with a data-race can cause an unsafe value to be picked up by the 3rd goroutine.
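The size claim can be checked with a small sketch. The helper interfaceWords is a hypothetical name, and the two-word layout is an implementation detail of the current gc toolchain rather than something the language specification promises.

```go
package main

import (
	"fmt"
	"unsafe"
)

// interfaceWords reports how many machine words an empty interface value
// occupies. With the current gc toolchain this is a (type, data) pair.
func interfaceWords() uintptr {
	var v interface{}
	return unsafe.Sizeof(v) / unsafe.Sizeof(uintptr(0))
}

func main() {
	fmt.Println("words per interface value:", interfaceWords())
	fmt.Println("bits per word:", 8*unsafe.Sizeof(uintptr(0)))
}
```

Because the value spans two words, an unsynchronized write is not guaranteed to be observed as a single unit, which is what allows a reader to see the type word of one value paired with the data word of another.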

But wait, I think we can just run the program with the race-detector... it'll be slower, but it'll be correct, as in, this data-race issue would be detected at run-time.

The other solution is to pack more into 64-bits, as in, be able to represent interface var's as 64-bits, which would require (say) allocating some number of bits for the itable, and some 50+ bits to point to the heap... This would resolve the data-race issue on 64-bit machines.

@jaekwon

jaekwon commented Jan 22, 2018

In any case, avoiding exposed global state still seems like generally good practice, so maybe that's all we need: to make exposing vars at the module level illegal, and to communicate through blogs and so on why global state is bad from a certain perspective of security analysis, where the threat model is a malicious dependency. By locking the dependency versions with glide (say), scanning the dependency list for usage of "unsafe", running with the race detector (say), avoiding module-level vars (e.g. the "crypto/rand.Reader" var), and only buying RAM that isn't vulnerable to loopholes (or sufficiently duplicating the logic if the loophole is sufficiently unpredictable), we reduce the surface area for the malicious dependency to operate in, minimizing and possibly even neutralizing any damage that may result.

I think this is what @wora means by "capabilities based security". That's what I mean by it. It can be summarized with maybe a single mantra... "If a function has access to a capability as provided by the runtime or programming language, then it's because the function is implicitly allowed permission to execute those capabilities, such as reading or writing to a slice, or calling a function with side-effects". Using this as the basis for secure programming design, we can grow a programming framework that reduces points of vulnerability. Your dependencies won't be able to muck with things that they can't even access!
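A minimal sketch of that mantra in Go, using hypothetical function names (writeReport, renderReport). The library function touches no package-level state and imports nothing that reaches outside the process; the only authority it has is the io.Writer the caller chooses to hand it.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// writeReport is a hypothetical library function. The writer it receives is
// its only capability: it cannot reach a file, socket, or terminal unless
// the caller passes one in.
func writeReport(w io.Writer, lines []string) error {
	for _, line := range lines {
		if _, err := io.WriteString(w, line+"\n"); err != nil {
			return err
		}
	}
	return nil
}

// renderReport is the caller. It decides what the library may write to, here
// an in-memory buffer that never leaves the process.
func renderReport(lines []string) (string, error) {
	var buf bytes.Buffer
	if err := writeReport(&buf, lines); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderReport([]string{"alpha", "beta"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Swapping the buffer for a file or a network connection is the caller's decision, which is the sense in which the dependency cannot use what it was never given.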

@pciet
Contributor

pciet commented Jan 22, 2018

Stateless packages, no assembly, and import constraints are too big of a concession for competing public libraries and also leave the memory accessible through other means we haven’t thought about. That data race is one of likely many possible hacks when working within the process memory space.

"If a function has access to a capability as provided by the runtime or programming language, then it's because the function is implicitly allowed permission to execute those capabilities, such as reading or writing to a slice, or calling a function with side-effects"

@jaekwon, @wora, what is your source for this? Specifically process capability security, not OS or hardware capability security as described by the Wikipedia article.

@wora
Author

wora commented Jan 22, 2018

Java and C# already solved the same problem with a more complicated and less usable solution. Browsers also solved it in a different way. Obviously there is a cost for it, but I don't think it is prohibitive. I haven't used any global variables in my code for many years, and it works just fine.

I worked on the JVM before. I am confident that a capability-based model would be significantly simpler and more usable than the Java security model. It also encourages developers to design their software better.

@erights is an expert in this area. The E programming language uses capability-based security, see https://en.wikipedia.org/wiki/E_(programming_language).

@as
Contributor

as commented Jan 24, 2018

@jaekwon Your assumption is wrong for multiple reasons. There is no guarantee types of a certain size are inherently safe to update concurrently. Even if they were, interface types are not the only types, and not every such type would be aligned on a memory location that would guarantee atomic updates.

@jaekwon

jaekwon commented Jan 28, 2018

@jaekwon Your assumption is wrong for multiple reasons. There is no guarantee types of a certain size are inherently safe to update concurrently. Even if they were, interface types are not the only types, and not every such type would be aligned on a memory location that would guarantee atomic updates.

You're referring to how it's implemented now. I'm referring to how it could be implemented, since I'm more interested in the long-term prospects of the language and ecosystem.

Go2 (say) could be implemented such that it supports atomic updates for all types. It could do this by packing more into 64 bits, and it would work on commodity machines. It would probably take a performance hit.

@ianlancetaylor
Contributor

I believe this proposal can be implemented entirely using a static analysis tool. That tool can look through the whole program and verify that a specific set of packages are safe. There is no need to change the language for this. As we gain experience with that tool, if it turns out to be very useful, then we can consider bringing it into the language proper.
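As a rough illustration of what such a tool might look like, here is a single-file sketch built on the standard go/parser and go/ast packages. The function name checkStateless and the exact rule set (forbidden imports, package-level vars, go statements) are assumptions pieced together from the conditions discussed in this thread; a real tool would need to analyze the whole program, including transitive imports.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// checkStateless is a hypothetical checker for one source file. It reports
// constructs that the thread suggests a stateless package must avoid.
func checkStateless(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}
	var problems []string

	// Rule 1 (assumed list): no imports that reach outside the process.
	for _, imp := range file.Imports {
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		switch path {
		case "unsafe", "syscall", "os", "net":
			problems = append(problems, "forbidden import: "+path)
		}
	}

	// Rule 2: no package-level variables. file.Decls holds only top-level
	// declarations, so local vars inside functions are not flagged.
	for _, decl := range file.Decls {
		gen, ok := decl.(*ast.GenDecl)
		if !ok || gen.Tok != token.VAR {
			continue
		}
		for _, spec := range gen.Specs {
			if vs, ok := spec.(*ast.ValueSpec); ok {
				for _, name := range vs.Names {
					problems = append(problems, "package-level var: "+name.Name)
				}
			}
		}
	}

	// Rule 3: no go statements anywhere in the file.
	ast.Inspect(file, func(n ast.Node) bool {
		if _, ok := n.(*ast.GoStmt); ok {
			problems = append(problems, "go statement")
		}
		return true
	})
	return problems, nil
}

const sample = `package p

import "unsafe"

var cache = map[string]int{}

func f() { go func() { _ = unsafe.Sizeof(cache) }() }
`

func main() {
	problems, err := checkStateless(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, p := range problems {
		fmt.Println(p)
	}
}
```

The sketch does not detect assembly files or cgo, which live outside the Go syntax tree and would have to be rejected by inspecting the package's file list instead.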

@golang golang locked and limited conversation to collaborators Apr 17, 2019