const-eval: explain the final-value-byte-provenance restriction #2138
Good question. We had put the restrictions on the final value of a const in the chapter on const items, but given that this applies to static items as well, that'd maybe be a bit more annoying. Maybe @ehuss will have a good idea on this.
The text here is describing a restriction on the final value of a constant or static initializer. We could put this in the chapters for constant and static items, but then we'd have to duplicate it. At the same time, it doesn't make sense to be in the section for constant expressions since this is not a restriction on those expressions. Let's solve this, for now, by keeping it in the chapter on constant evaluation but putting it in a new "constant initializers" section. We'll move the second paragraph, which states what a compiler should do as a "quality-of-implementation" matter, into an admonition, and we'll reword this a bit to talk about what `rustc` does (but does not guarantee) to match the style we use for this in other places. We also add links to the appropriate chapters and sections for constant and static initializers.
Let's add examples and explanatory notes to clarify the restriction that the representation of the final value of a constant or static initializer must only contain bytes with provenance in whole-pointer groups. We'll add a `compile_fail` example demonstrating how storing a pointer that extends into padding creates pointer fragments in the final value, causing compilation to fail, and show how to work around this by explicitly zeroing the padding bytes. Let's extend the existing note about uninitialized padding bytes to provide deeper intuition about this restriction and explain how constant evaluation makes the details of typed copies observable (whether field-by-field or memory-block), how these details are not yet fully specified in Rust, and why the compiler must be allowed to reject initializers with uninitialized padding bytes to preserve future flexibility (such as always setting padding to uninitialized). Context: rust-lang/rust#148470, rust-lang/rust#148967
Let's add an admonition explaining that bytes with provenance must form a complete pointer in the correct order.
Let's add an example demonstrating that reversing the order of pointer bytes causes compilation to fail, even though all bytes are present. The compiler tracks the position of each byte within its original pointer and only accepts pointers when reassembled in the correct order. This example copies a pointer byte-by-byte in reverse order into the padding of a struct, which fails because the fragment indices don't match up to form a valid pointer. Context: rust-lang/rust#144081
In const eval, the bytes with provenance that become part of the final value must form a whole-pointer group with the bytes in the correct order. We have admonitions about this; let's also add disambiguating text to the normative language.
@RalfJung: I added some examples and admonitions with explanation and motivation. Do these look right to you?
struct Pair {
    x: u128,
    y: MaybeUninit<u64>,
    // 8 bytes of padding at offset 24.
That's only true if `u128` has an alignment of 16 and `u64` an alignment of 8.
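One way to avoid hard-coding the layout would be to compute the offsets from `size_of` and `align_of`. A minimal sketch, assuming a `#[repr(C)]` attribute on `Pair` so the field order is fixed (the quoted snippet doesn't show whether it has one); `round_up`, `Y_OFFSET`, `PADDING_START`, and `PADDING_LEN` are names made up here for illustration:

```rust
use std::mem::{align_of, size_of, MaybeUninit};

// Assumption: `#[repr(C)]` is added so that `x` comes before `y` in memory;
// the snippet under review may not carry this attribute.
#[repr(C)]
#[allow(dead_code)]
struct Pair {
    x: u128,
    y: MaybeUninit<u64>,
}

/// Rounds `n` up to the next multiple of `align` (which must be a power of two).
const fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) & !(align - 1)
}

/// Offset of `y` under the `repr(C)` layout rules: `x` starts at 0, and `y`
/// starts at the first suitably aligned offset after `x`.
const Y_OFFSET: usize = round_up(size_of::<u128>(), align_of::<MaybeUninit<u64>>());

/// Offset at which the trailing padding starts (the end of `y`).
const PADDING_START: usize = Y_OFFSET + size_of::<MaybeUninit<u64>>();

/// Number of trailing padding bytes; this can be zero on some targets.
const PADDING_LEN: usize = size_of::<Pair>() - PADDING_START;
```

On a target where `u128` is 16-byte aligned and `u64` is 8-byte aligned, this gives 8 bytes of padding starting at offset 24. Where `u128` is only 8-byte aligned, there is no trailing padding at all.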
> [!NOTE]
> The bytes with provenance must form a complete pointer in the correct order. In the example above, the pointer is written at offset 20, but it requires (on 64-bit platforms) 8 bytes. Four of those bytes fit in the `y` field; the rest extend into the padding at offset 24. When the fields are initialized, the `y` bytes get overwritten, leaving only a partial pointer (4 bytes) in the padding. These 4 bytes have provenance but don't form a complete pointer, causing compilation to fail.
>
> This restriction ensures that any bytes with provenance in the final value represent complete, valid pointers. The compiler cannot support pointer fragments because it would be unable to reason about them at compile time.
"valid" is a dangerous choice here. The pointer may well be out-of-bounds! That's fine as far as this particular restriction is concerned.
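For illustration, here is a sketch of a final value holding an out-of-bounds pointer whose bytes are still a whole pointer in the correct order. `PTRS` is a hypothetical name, and this assumes `wrapping_add` is usable in const context on the toolchain at hand:

```rust
/// A pair of raw pointers computed at compile time: a pointer to a one-byte
/// allocation, and the same pointer moved 10 bytes past it. The second
/// pointer is out of bounds, but every byte of it still belongs to one
/// complete pointer, so the final value contains no pointer fragments.
const PTRS: (*const u8, *const u8) = {
    let base = &0u8 as *const u8;
    (base, base.wrapping_add(10))
};
```

Dereferencing the second pointer would of course be undefined behavior; the point is only that out-of-bounds-ness and fragment-ness are separate questions.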
> The compiler cannot support pointer fragments because it would be unable to reason about them at compile time.
This is incorrect. The compiler actually has no problem at all reasoning about them. However, typical binary formats like ELF do not support representing pointer fragments so we just can't produce a binary when they occur.
> let mut m = MaybeUninit::<Pair>::uninit();
> let ptr: *const u8 = &0;
> let ptr_bytes = &ptr as *const _ as *const MaybeUninit<u8>;
> // Write pointer bytes in reverse order into the padding.
I don't understand how padding causes the pointer bytes to be reversed...?
Generally this example seems unnecessarily complicated. You can just have a single field of type MaybeUninit<u64> (no padding) and store a byte-reversed pointer in there.
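As a rough illustration of that simpler shape, here is a runtime sketch of the byte-by-byte copy into a single pointer-sized slot with no padding. `copy_ptr_bytes` is a hypothetical helper, not code from the PR. At runtime both orders execute; per the PR description, it is the const-eval counterpart of the reversed copy that would be rejected when it ends up in the final value of a const or static.

```rust
use std::mem::{size_of, MaybeUninit};

/// Copies the bytes of `ptr` one at a time into a fresh pointer-sized slot,
/// either in their original order or in reverse order.
fn copy_ptr_bytes(ptr: *const u8, reverse: bool) -> MaybeUninit<*const u8> {
    let mut out = MaybeUninit::<*const u8>::uninit();
    // View the source pointer and the destination slot as raw bytes.
    let src = &ptr as *const *const u8 as *const MaybeUninit<u8>;
    let dst = out.as_mut_ptr() as *mut MaybeUninit<u8>;
    let n = size_of::<*const u8>();
    for i in 0..n {
        // Byte `i` of the source lands at index `j` of the destination.
        let j = if reverse { n - 1 - i } else { i };
        // SAFETY: `i` and `j` are both below `n`, so the read stays inside
        // `ptr` and the write stays inside `out`.
        unsafe { dst.add(j).write(src.add(i).read()) };
    }
    out
}
```

With `reverse` set to `false`, the slot holds the original pointer again. With `reverse` set to `true`, all of the pointer's bytes are present but at the wrong positions.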
> const ALLOWED: Pair = Pair { x: 0, y: 0 };
> ```
>
> Constant evaluation makes the details of typed copies observable: depending on whether a copy is performed field-by-field or as a memory-block copy, provenance in padding bytes might be discarded or preserved (both in the source and in the destination). Because the semantics of typed copies are not yet fully specified in Rust --- and to preserve the ability to change how they work in the future in constant evaluation (for example, to always set the padding bytes to uninitialized) --- the language allows the compiler to reject any initializer with an uninitialized padding byte. Since the compiler cannot currently guarantee that an uninitialized byte does not contain a pointer fragment without a full model of typed copies, this allowance is necessary to avoid relying on underspecified details of the language.
Even if we decide to go with the MiniRust semantics for typed copies, nothing about this changes. So it's not correct to say that this is "Because the semantics of typed copies are not yet fully specified in Rust".
This is the docs part for rust-lang/rust#148967.
I had no idea where to put it. Putting it in "const-expr" is odd since that is generally a compositional notion (the subexpressions of a const expression must be const expressions), whereas the limitation I am describing here only applies at the top level, i.e. for the final value of a const/static initializer.