proposal: Go 2: bit and byte allocation in structures #29650

Closed
viper10652 opened this issue Jan 10, 2019 · 8 comments
Labels
FrozenDueToAge LanguageChange Proposal v2 A language change or incompatible library change

@viper10652

Context:
In data communication protocols, a structure or record often has fields that are narrower than 8 bits, or multi-bit fields that must sit at very specific offsets.

C and C++ have bitfields and struct types, but the standards do not guarantee the layout. The compiler is free to pad the record with additional bytes to suit the processor's alignment requirements. Padding can be controlled via compiler options, but not portably.

The Ada programming language provides record data types where exact bit and byte offsets can be specified.

Proposal:

type PACKET record big-endian {
    f1 byte offset 0
    f2 bitfield offset 1 {
        bf1 bits 3 offset 0 // referenced as f2.bf1
        bf2 bits 1 offset 3 // bit offset refers to the n-power value, i.e. mask for bf2 is 2^3
        bf3 bits 4 offset 4 // offset of bitfield refers to lsb, other bits can be inferred from bits value
    }
    f3 uint32 offset 2
    f4 range 0x0001 .. 0x0100 offset 6
}
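For comparison, here is a minimal sketch of how the same layout could be decoded in today's Go with masks, shifts and `encoding/binary`. The names `Packet`, `Decode` and `packetLen` are hypothetical, and the sketch assumes `f4` is a 16-bit field occupying bytes 6 and 7, which the proposal does not state.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Packet is a hypothetical decoded form of the proposed PACKET record.
type Packet struct {
	F1  byte
	BF1 byte   // 3 bits, bits 0..2 of byte 1
	BF2 byte   // 1 bit, bit 3 of byte 1
	BF3 byte   // 4 bits, bits 4..7 of byte 1
	F3  uint32 // bytes 2..5, big-endian
	F4  uint16 // bytes 6..7, big-endian (the 16-bit width is an assumption)
}

// packetLen is the assumed wire size of the record in bytes.
const packetLen = 8

// Decode copies the fields out of b using the offsets from the proposal.
func Decode(b []byte) (Packet, error) {
	if len(b) < packetLen {
		return Packet{}, fmt.Errorf("packet too short: %d bytes, need %d", len(b), packetLen)
	}
	return Packet{
		F1:  b[0],
		BF1: b[1] & 0x07,
		BF2: (b[1] >> 3) & 0x01,
		BF3: b[1] >> 4,
		F3:  binary.BigEndian.Uint32(b[2:6]),
		F4:  binary.BigEndian.Uint16(b[6:8]),
	}, nil
}

func main() {
	raw := []byte{0xAA, 0x4D, 0x00, 0x00, 0x01, 0x02, 0x00, 0x10}
	p, err := Decode(raw)
	fmt.Printf("%+v %v\n", p, err)
}
```

Every offset and width here appears as a hand-written constant, which is the bookkeeping the proposed record syntax would move into the type declaration.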

Considerations:
1) allow specification of endianness, e.g. big-endian or little-endian

2) the compiler should statically check for field overlaps

3) the compiler should give warnings when the definition leaves gaps between fields

4) we should consider variants in record definitions

E.g.
type PTYPE enum {P1 = 1, P2 = 4, P3 = 8}
type PACKET record {
    ptype PTYPE offset 0
    switch ptype {
    case P1:
        f1 byte offset 2
        f2 byte offset 3

    case P2:
        f1 uint16 offset 2
    }
}

5) it should be possible to safely convert between types.
E.g.
    var datastream []byte
    var err error
    var pp PACKET
    pp, err = datastream
Conversion errors can occur when raw values are out of range for the associated record field types.
In that case, variable pp is nil and err specifies which field is out of bounds.

This conversion should be a mapping rather than a copy,
i.e. after the mapping, the start address of pp is the same as the start address of the array underlying datastream.

This is useful in encoding/decoding data streams to logical data without having to iterate over the record fields.
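The "mapping rather than copy" idea can be approximated in current Go with a slice-to-array-pointer conversion (available since Go 1.17). The sketch below is only an illustration: `PacketView`, `View` and `viewLen` are hypothetical names, the 16-bit width of `f4` is an assumption, and the range check runs at run time rather than being enforced by the type system as the proposal intends.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// viewLen is the assumed wire size of the record in bytes.
const viewLen = 8

// PacketView is a hypothetical fixed-size view over the wire bytes.
type PacketView [viewLen]byte

// F4 reads the field at byte offset 6 as a big-endian uint16 (width assumed).
func (p *PacketView) F4() uint16 { return binary.BigEndian.Uint16(p[6:8]) }

// View maps data onto a PacketView without copying, then validates the
// range of f4 (0x0001 .. 0x0100) given in the proposal.
func View(data []byte) (*PacketView, error) {
	if len(data) < viewLen {
		return nil, fmt.Errorf("need %d bytes, got %d", viewLen, len(data))
	}
	p := (*PacketView)(data) // shares storage with data; requires Go 1.17+
	if v := p.F4(); v < 0x0001 || v > 0x0100 {
		return nil, fmt.Errorf("f4 out of range: 0x%04x", v)
	}
	return p, nil
}

func main() {
	data := []byte{0, 0, 0, 0, 0, 0, 0x00, 0x10}
	p, err := View(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The view and the slice start at the same address.
	fmt.Println(&p[0] == &data[0], p.F4())
}
```

Because the view aliases the slice, later writes to `data` are visible through `p`, so the range check only holds for the moment it ran.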

@gopherbot gopherbot added this to the Proposal milestone Jan 10, 2019
@mvdan mvdan added LanguageChange v2 A language change or incompatible library change labels Jan 10, 2019
@ianlancetaylor ianlancetaylor changed the title proposal: bit and byte allocation in structures proposal: Go 2: bit and byte allocation in structures Jan 11, 2019
@ianlancetaylor
Contributor

Using the & and | operators with byte values is unambiguous and reliable, and you can write methods to use them to access specific data fields. Those methods will be almost as easy to use as direct field references, and they will be just as efficient at run time. I don't think there is sufficient benefit to justify the additional language complexity.

@viper10652
Author

Whether the bit stuffing approach is unambiguous and reliable is highly subjective. I personally don't agree.
Bit stuffing is only used because it is the only viable solution available in most languages, not because it is the best method.
Only Ada programmers have had the opportunity to use the suggested approach. In my 10 years of experience, it is more readable and less ambiguous than bit stuffing, with much less chance for error.
Nothing prevents you from changing the wrong bits in the bit stuffing approach (other than checking dynamically each time, which costs computational cycles).
The record-approach makes it impossible to accidentally change the wrong bits.
I can't tell you how many bugs I fixed involving a screw-up with the mask value and the shift operations needed in the bit-stuffing approach.
When I worked in Ada using the record approach, we never had any issues with inadvertent bits being changed.

Let's say we want to change the value of f2.bf3 to 0x4 (binary 0100).
Look at the following snippet and see which bits are being set:

// bit stuffing
var msg1 []byte
msg1[1] = (msg1[1] & 0x0F) | (0x4 << 4) // keep bits 0..3, write 0x4 into bits 4..7

or simply assign each field its value, without needing masks and shift operations:
// record approach
var msg1 PACKET
msg1.f2.bf1 = 0x7 // largest value that fits in 3 bits
msg1.f2.bf2 = 0x1
msg1.f2.bf3 = 0x4

@ianlancetaylor
Contributor

Well, but you wouldn't write

msg1[1] = (msg1[1] & 0x0f) | (0x4 << 4)

You would write

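type Packet [16]byte // or whatever

// BF3 returns the upper four bits of byte 1.
func (p *Packet) BF3() byte { return p[1] >> 4 }

// SetBF3 replaces the upper four bits of byte 1 and keeps the lower four.
func (p *Packet) SetBF3(v byte) { p[1] = (p[1] & 0x0f) | (v << 4) }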

Then when you want to set the field you write

msg.SetBF3(4)

@viper10652
Author

The stated problem does not go away by encapsulating it in a function, i.e. you can still screw up the implementation of SetBF3() and the compiler would still not be able to catch unintended changes in the adjacent bit fields.

My proposal lets people stop thinking about the problem in terms of binary arithmetic and approach it from a data-driven design perspective instead, specifying things for what they are:
changing a predefined number of bits at a specific offset within a byte without having to worry about the adjacent bits,
instead of figuring out an arithmetic expression that will (hopefully) change the right bits, with no way for the compiler to prevent you from screwing it up.
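One way to get part of this effect in current Go is to describe each field by its width and offset and derive the mask from that description, so the mask and the shift cannot disagree. The sketch below is a hedged illustration, not the proposal's mechanism: `Field`, `Get`, `Set` and the `BF1`..`BF3` variables are hypothetical names, and the checks happen at run time, not at compile time.

```go
package main

import "fmt"

// Field describes a bit field inside one byte: Width bits starting at
// bit Offset, where bit 0 is the least significant bit.
type Field struct {
	Offset, Width uint
}

// Hypothetical descriptors for the bit fields of f2 in the proposal.
var (
	BF1 = Field{Offset: 0, Width: 3}
	BF2 = Field{Offset: 3, Width: 1}
	BF3 = Field{Offset: 4, Width: 4}
)

// mask is derived from Width and Offset, never written by hand.
func (f Field) mask() byte {
	return byte(((uint(1) << f.Width) - 1) << f.Offset)
}

// Get extracts the field's value from b.
func (f Field) Get(b byte) byte { return (b & f.mask()) >> f.Offset }

// Set returns b with the field replaced by v and all other bits unchanged.
// It rejects values that do not fit in the field.
func (f Field) Set(b, v byte) (byte, error) {
	if uint(v) >= uint(1)<<f.Width {
		return b, fmt.Errorf("value %#x does not fit in %d bits", v, f.Width)
	}
	return (b &^ f.mask()) | (v << f.Offset), nil
}

func main() {
	b, err := BF3.Set(0xAB, 0x4)
	fmt.Printf("%#02x %v\n", b, err)
}
```

A wrong `Offset` or `Width` in a descriptor is still possible, and overlap between descriptors is not checked here; that static checking is what the proposal asks the compiler to do.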

@josharian
Contributor

If you’re worried about screwing up the implementation, you could use code generation. It’s a good candidate for a codegen tool, I’d say.

@beoran

beoran commented Jan 15, 2019

While this feature would make safely handling bits easier, exact struct layout is technically not possible on RISC architectures, where an unaligned access causes an alignment error from the CPU. Only the Intel architectures, x86 and amd64, accept unaligned access, at a performance penalty. To make this feature work on all platforms, the compiler would have to turn every read and write of such bit fields into per-field &, | and ^ operations. So this feature would largely end up being syntactic sugar that also hides a performance penalty.
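The alignment concern can be made concrete with `f3` from the proposal, a 32-bit value at byte offset 2, which is not a multiple of 4. A portable reader in current Go assembles the value byte by byte, which is what `binary.BigEndian.Uint32` does semantically. The function names `readF3` and `readF3Manual` below are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readF3 reads the big-endian uint32 stored at byte offset 2.
// It works on a byte slice, so no aligned 32-bit load is required.
func readF3(b []byte) uint32 { return binary.BigEndian.Uint32(b[2:6]) }

// readF3Manual spells out the same computation with shifts and ORs.
func readF3Manual(b []byte) uint32 {
	return uint32(b[2])<<24 | uint32(b[3])<<16 | uint32(b[4])<<8 | uint32(b[5])
}

func main() {
	raw := []byte{0x00, 0x00, 0x01, 0x02, 0x03, 0x04, 0x00, 0x00}
	fmt.Printf("%#08x %#08x\n", readF3(raw), readF3Manual(raw))
}
```

Both functions return the same value on every platform, because neither depends on the address of `b[2]` being aligned.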

@vdobler
Contributor

vdobler commented Jan 22, 2019

If you really screw up the implementation of these encapsulations, then the tests would fail and you would fix the implementation. You might detect a problem a tiny bit later in the development cycle, but still early enough.
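A test of this kind can be exhaustive, since a single byte has only 256 states. The sketch below assumes accessors shaped like the `BF3`/`SetBF3` methods discussed earlier in the thread (upper four bits of byte 1); `checkSetBF3` is a hypothetical helper, and in a real project the same loop would live in a `_test.go` file using the `testing` package.

```go
package main

import "fmt"

// Packet and its accessors mirror the accessor style discussed in the thread.
type Packet [16]byte

// BF3 returns the upper four bits of byte 1.
func (p *Packet) BF3() byte { return p[1] >> 4 }

// SetBF3 replaces the upper four bits of byte 1 and keeps the lower four.
func (p *Packet) SetBF3(v byte) { p[1] = (p[1] & 0x0f) | (v << 4) }

// checkSetBF3 tries every starting byte and every 4-bit value, and verifies
// that the field reads back correctly and the neighboring bits are untouched.
func checkSetBF3() error {
	for start := 0; start < 256; start++ {
		for v := byte(0); v < 16; v++ {
			var p Packet
			p[1] = byte(start)
			p.SetBF3(v)
			if got := p.BF3(); got != v {
				return fmt.Errorf("start %#02x: BF3() = %#x, want %#x", start, got, v)
			}
			if p[1]&0x0f != byte(start)&0x0f {
				return fmt.Errorf("start %#02x, v %#x: low bits changed to %#x", start, v, p[1]&0x0f)
			}
		}
	}
	return nil
}

func main() {
	if err := checkSetBF3(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

The check catches a wrong mask or shift in the accessor, though only when the test runs, not at compile time.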
I'm afraid of seeing too many premature bit-packing "optimizations" if the language itself supports this "feature".

@ianlancetaylor
Contributor

There are few cases where this kind of code is required, and there are existing ways to handle those cases as discussed above. When using hardware it can actually be clearer to use & and | notation as that tells you the precise values that will be written out, whereas with bitfields you have to understand the order of the bits within the larger fields (which in C varies by architecture). And when setting a bitfield it is unclear what happens to other unspecified bits in the overall byte. This all means additional language complexity, for a feature that few people need to use. Thanks for the suggestion, but the benefit is not worth the cost.

@golang golang locked and limited conversation to collaborators Jan 22, 2020

7 participants